(57) Abstract: The present invention provides novel nucleic acids, novel polypeptide sequences encoded by these nucleic acids and
NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by 
such polynucleotides, along with uses for these polynucleotides and proteins, for example
in therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines,
such as lymphokines, interferons, CSFs, chemokines, and interleukins) has matured
rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid
sequence of the protein in the case of hybridization cloning; activity of the protein in the
case of expression cloning). More recent "indirect" cloning techniques such as signal
sequence cloning, which isolates DNA sequences based on the presence of a now
well-recognized secretory leader sequence motif, as well as various PCR-based or low
stringency hybridization-based cloning techniques, have advanced the state of the art by
making available large numbers of DNA/amino acid sequences for proteins that are
known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of
PCR-based techniques, or by virtue of structural similarity to other genes of known
biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications 
in, for example, diagnostics, forensics, gene mapping; identification of mutations
responsible for genetic disorders or other traits, to assess biodiversity, and to produce
many other types of data and products dependent on DNA and amino acid sequences.



3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules, cloned genes or degenerate variants thereof, especially naturally occurring
variants such as allelic variants, antisense polynucleotide molecules, and antibodies that
specifically recognize one or more epitopes present on such polypeptides, as well as
hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including
expression vectors, containing the polynucleotides of the invention, cells genetically
engineered to contain such polynucleotides and cells genetically engineered to express such
polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic 
acid sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by hybridization (SBH), and in some cases, sequences obtained from one or
more public databases. The invention relates also to the proteins encoded by such
polynucleotides, along with therapeutic, diagnostic and research utilities for these
polynucleotides and proteins. These nucleic acid sequences are designated as SEQ ID NO:
1 - 438 and are provided in the Sequence Listing. In the nucleic acids provided in the
Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N is any of
the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid
sequences that hybridize to the complement of SEQ ID NO: 1 - 438 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above, or nucleic acid sequences
that encode a peptide comprising a specific domain or truncation of the peptides encoded by
SEQ ID NO: 1 - 438. Also included is a polynucleotide comprising a nucleotide sequence
having at least 90% identity to an identifying sequence of SEQ ID NO: 1 - 438 or a
degenerate variant or fragment thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence 
information from the nucleic acid sequences of SEQ ID NO: 1 - 438. The sequence
information can be a segment of any one of SEQ ID NO: 1 - 438 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1 - 438.
A collection as used in this application can be a collection of only one
polynucleotide. The collection of sequence information or identifying information of each
sequence can be provided on a nucleic acid array. In one embodiment, segments of
sequence information are provided on a nucleic acid array to detect the polynucleotide that
contains the segment. The array can be designed to detect full-match or mismatch to the
polynucleotide that contains the segment. The collection can also be provided in a
computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic
acid sequences recited above; cloning or expression vectors containing the nucleic acid
sequences; and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences (or their reverse or direct complements) according to the invention have
numerous applications in a variety of techniques known to those skilled in the art of
molecular biology, such as use as hybridization probes, use as primers for PCR, use in an
array, use in computer-readable media, use in sequencing full-length genes, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-438 or 
novel segments or parts of the nucleic acids of the invention are used as primers in 
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-438 or novel segments or parts of the nucleic acids
provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence
tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-438;
a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-438; and a polynucleotide comprising any of the nucleotide sequences of the mature
protein coding sequences of SEQ ID NO: 1-438. The polynucleotides of the present
invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide
sequences set forth in SEQ ID NO: 1-438; (b) a nucleotide sequence encoding any one of
the amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an
allelic variant of any polynucleotides recited above; (d) a polynucleotide which encodes a
species homolog (e.g. orthologs) of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any
of the polypeptides comprising an amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising any of the amino acid sequences set forth in the Sequence Listing;
or the corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides
having a nucleotide sequence set forth in SEQ ID NO: 1-438; or (b) polynucleotides that
hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically or immunologically active variants of any of the polypeptide
sequences in the Sequence Listing, and "substantial equivalents" thereof (e.g., with at least
about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity)
that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g. host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the 
invention. Polypeptide compositions of the invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture
medium under conditions permitting expression of the desired polypeptide, and purifying
the polypeptide from the culture or from the host cells. Preferred embodiments include
those in which the protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a 
variety of techniques known to those skilled in the art of molecular biology. These
techniques include use as hybridization probes, use as oligomers, or primers, for PCR,
use for chromosome and gene mapping, use in the recombinant production of protein,
and use in generation of anti-sense DNA or RNA, their chemical analogs and the like.
For example, when the expression of an mRNA is largely restricted to a particular cell or
tissue type, polynucleotides of the invention can be used as hybridization probes to detect
the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of
conventional procedures and methods that are currently applied to other proteins. For
example, a polypeptide of the invention can be used to generate an antibody that
specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies,
are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the
invention can also be used as molecular weight markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical 
condition which comprises the step of administering to a mammalian subject a 
therapeutically effective amount of a composition comprising a polypeptide of the 
present invention and a pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be
utilized, for example, in methods for the prevention and/or treatment of disorders
involving aberrant protein expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for
example, be utilized as part of prognostic and diagnostic evaluation of disorders as
recited herein and for the identification of subjects exhibiting a predisposition to such
conditions. The invention provides a method for detecting the polynucleotides of the
invention in a sample, comprising contacting the sample with a compound that binds to
and forms a complex with the polynucleotide of interest for a period sufficient to form
the complex and under conditions sufficient to form a complex and detecting the complex
such that if a complex is detected, the polynucleotide of interest is detected. The
invention also provides a method for detecting the polypeptides of the invention in a
sample comprising contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex and detecting the formation of the complex such that if a complex is formed, the
polypeptide is detected.

The invention also provides kits comprising polynucleotide probes and/or 
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of 
the invention. Furthermore, the invention provides methods for evaluating the efficacy of
drugs, and monitoring the progress of patients, involved in clinical trials for the treatment
of disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides
and/or polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of
the invention comprising contacting the compound with a polypeptide of the invention in
a cell for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives expression of a reporter gene sequence in the cell; and detecting the
complex by detecting the reporter gene sequence expression such that if expression of the
reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The methods of the invention also provide methods for treatment which involve
the administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and
other substances that modulate the overall activity of the target gene products.
Compounds and other substances can effect such modulation either on the level of target
gene/protein expression or target protein activity.

The polypeptides of the present invention and the polynucleotides encoding them
are also useful for the same functions known to one of skill in the art as the polypeptides
and polynucleotides to which they have homology (set forth in Table 2); for which they
have a signature region (as set forth in Table 3); or for which they have homology to a
gene family (as set forth in Table 4). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of
applications, as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 

4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular 
forms "a", "an" and "the" include plural references unless the context clearly dictates 
otherwise. 

The term "active" refers to those forms of the polypeptide which retain the
biologic and/or immunologic activities of any naturally occurring polypeptide. According
to the invention, the terms "biologically active" or "biological activity" refer to a protein
or peptide having structural, regulatory or biochemical functions of a naturally occurring
molecule. Likewise "immunologically active" or "immunological activity" refers to the
capability of the natural, recombinant or synthetic polypeptide to induce a specific
immune response in appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of 
secretory or enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only some of the nucleic acids bind or it may be
"complete" such that total complementarity exists between the single stranded molecules.
The degree of complementarity between the nucleic acid strands has significant effects on
the efficiency and strength of the hybridization between the nucleic acid strands.
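
By way of illustration only, the base-pairing rule in the example above (5'-AGT-3' pairing with 3'-TCA-5') and the distinction between "partial" and "complete" complementarity can be sketched in a few lines of Python. The function names are hypothetical and are not part of the specification; the sketch assumes plain A/C/G/T strings of equal length and ignores hybridization conditions entirely.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
PAIRS = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Complement of a strand, read in the opposite (3'->5') direction."""
    return seq.upper().translate(PAIRS)


def reverse_complement(seq: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def paired_fraction(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two equal-length strands,
    both written 5'->3', are aligned antiparallel.
    1.0 corresponds to "complete" complementarity; lower values are "partial"."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    expected = complement(strand_a)      # ideal partner, read 3'->5'
    observed = strand_b.upper()[::-1]    # strand_b flipped to read 3'->5'
    matches = sum(e == o for e, o in zip(expected, observed))
    return matches / len(strand_a)


if __name__ == "__main__":
    print(complement("AGT"))               # partner of 5'-AGT-3', read 3'->5'
    print(paired_fraction("AGT", "ACT"))   # complete complementarity
    print(paired_fraction("AGT", "ACA"))   # partial complementarity
```

The fraction returned is only a count of matching positions; it does not model the efficiency or strength of hybridization.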

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term
"germ line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that
provide a steady and continuous source of germ cells for the production of gametes. The
term "primordial germ cells (PGCs)" refers to a small population of cells set aside from
other cell lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis that have the potential to differentiate into germ cells and other cells.
PGCs are the source from which GSCs and ES cells are derived. The PGCs, the GSCs
and the ES cells are capable of self-renewal. Thus these cells not only populate the germ
line and give rise to a plurality of terminally differentiated cells that comprise the adult
specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides 
which modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably
linked sequence" when the expression of the sequence is altered by the presence of the
EMF. EMFs include, but are not limited to, promoters, and promoter modulating
sequences (inducible elements). One class of EMFs are nucleic acid fragments which
induce the expression of an operably linked ORF in response to a specific regulatory
factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic
or synthetic origin which may be single-stranded or double-stranded and may represent
the sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or
RNA-like material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and N is A, C, G or T (U). It is contemplated that where the polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil).
Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid
which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
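
As a non-limiting sketch of the notation just described, the following Python fragment substitutes U for T to obtain the RNA form of a listed sequence and enumerates the concrete sequences represented by a sequence containing N. The helper names are hypothetical, and the sketch assumes the input uses only the letters A, C, G, T and N.

```python
from itertools import product

BASES = "ACGT"  # the four bases that N may stand for in the sequences herein


def to_rna(dna: str) -> str:
    """RNA form of a listed sequence: every T (thymine) becomes U (uracil)."""
    return dna.upper().replace("T", "U")


def expand_n(seq: str) -> list[str]:
    """All concrete sequences represented by a sequence containing N.
    The count grows as 4**(number of N), so this suits short segments only."""
    choices = [BASES if base == "N" else base for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(to_rna("AGT"))
    print(expand_n("AN"))
```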

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion,"
or "segment" or "probe" or "primer" are used interchangeably and refer to a sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at least about
7 nucleotides, more preferably at least about 9 nucleotides, more preferably at least about
11 nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably less than about 100 nucleotides, more preferably less than about 50
nucleotides and most preferably less than 30 nucleotides. Preferably the probe is from
about 6 nucleotides to about 200 nucleotides, preferably from about 15 to about 50
nucleotides, more preferably from about 17 to 30 nucleotides and most preferably from
about 20 to 25 nucleotides. Preferably the fragments can be used in polymerase chain
reaction (PCR), various hybridization procedures or microarray procedures to identify or
amplify identical or related parts of mRNA or DNA molecules. A fragment or segment
may uniquely identify each polynucleotide sequence of the present invention. Preferably
the fragment comprises a sequence substantially similar to any one of SEQ ID NOs: 1-438.

Probes may, for example, be used to determine whether specific mRNA 
molecules are present in a cell or tissue or to isolate similar nucleic acid sequences from
chromosomal DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods
Appl 1:241-250). They may be labeled by nick translation, Klenow fill-in reaction, PCR,
or other methods well known in the art. Probes of the present invention, their preparation
and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989,
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, both of
which are incorporated herein by reference in their entirety.

The nucleic acid sequences of the present invention also include the sequence 
information from the nucleic acid sequences of SEQ ID NOs: 1-438. The sequence
information can be a segment of any one of SEQ ID NOs: 1-438 that uniquely identifies
or represents the sequence information of that sequence of SEQ ID NO: 1-438. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are
three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers
exist, there are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in
arrays for expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is fully matched in the expressed sequences is also approximately one in five
because expressed sequences comprise less than approximately 5% of the entire genome
sequence.
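
The arithmetic behind these figures can be reproduced with a short Python sketch, offered only as an illustration. It follows the simplified model used above (the number of possible k-mers, 4^k, divided by the number of base pairs in the target) and takes the genome size and the 5% expressed fraction from the text; the function names are hypothetical. The exact ratios differ somewhat from the rounded figures quoted above.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # expressed share of the genome (upper bound from the text)


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over A, C, G and T."""
    return 4 ** k


def kmers_per_base_pair(k: int, target_bp: float = GENOME_BP) -> float:
    """How many possible k-mers exist for each base pair of the target.
    A value of x corresponds to a chance full match of roughly '1 in x'."""
    return possible_kmers(k) / target_bp


if __name__ == "__main__":
    for k in (20, 17):
        print(f"{k}-mer vs. genome: about 1 in {kmers_per_base_pair(k):.1f}")
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    ratio = kmers_per_base_pair(15, expressed_bp)
    print(f"15-mer vs. expressed sequences: about 1 in {ratio:.1f}")
```

Under these assumptions the twenty-mer ratio comes out in the low hundreds and the seventeen-mer and fifteen-mer ratios in the single digits, which is the order of magnitude stated above.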

Similarly, when using sequence information for detecting a single mismatch, a 
segment can be a twenty-five mer. The probability that the twenty-five mer would appear in
a human genome with a single mismatch is calculated by multiplying the probability for a
full match (1/4^25) times the increased probability for mismatch at each nucleotide position
(3 x 25). The probability that an eighteen mer with a single mismatch can be detected in an
array for expression studies is approximately one in five. The probability that a twenty-mer
with a single mismatch can be detected in a human genome is approximately one in five.
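
The single-mismatch calculation described above can likewise be sketched in Python. This is an illustration under the same simplified model, not a definitive statistical treatment: the full-match probability 1/4^k is multiplied by 3 x k (three alternative bases at each of k positions), and the result is scaled by the number of base pairs in the target. The function names and the genome size constant are assumptions introduced for the example.

```python
GENOME_BP = 3_000_000_000  # base pairs in one set of human chromosomes (from the text)


def single_mismatch_probability(k: int) -> float:
    """Probability that a random k-mer differs from a given k-mer at exactly
    one position, per the text: (1 / 4**k) * (3 * k)."""
    return (3 * k) / 4 ** k


def expected_chance_hits(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of single-mismatch occurrences in a target of the
    given size, treating every base pair as an independent start position."""
    return single_mismatch_probability(k) * target_bp


if __name__ == "__main__":
    for k in (25, 20):
        hits = expected_chance_hits(k)
        print(f"{k}-mer, one mismatch, whole genome: {hits:.6f} expected chance hits")
```

For a twenty-mer the expected number of chance hits is a little under 0.2, consistent with the "approximately one in five" figure given above; for a twenty-five mer it is several orders of magnitude smaller.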

The term "open reading frame," ORF, means a series of nucleotide triplets coding 
for amino acids without any termination codons and is a sequence translatable into
protein.
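
For illustration, the definition above (a run of nucleotide triplets containing no termination codon) can be checked mechanically. The Python sketch below assumes the standard genetic code, whose termination codons in DNA notation are TAA, TAG and TGA, and uses hypothetical function names; it reads a single frame of one strand and does not look for a start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code


def codons(seq: str, frame: int = 0) -> list[str]:
    """Split a sequence into complete triplets starting at the given frame offset."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def is_open(seq: str, frame: int = 0) -> bool:
    """True if no triplet in the chosen frame is a termination codon."""
    return all(codon not in STOP_CODONS for codon in codons(seq, frame))


def longest_open_run(seq: str, frame: int = 0) -> tuple[int, int]:
    """(start, end) indices of the longest stop-free run of triplets in the frame."""
    best = (frame, frame)
    run_start = frame
    pos = frame
    for codon in codons(seq, frame):
        if codon in STOP_CODONS:
            if pos - run_start > best[1] - best[0]:
                best = (run_start, pos)
            run_start = pos + 3  # the next run begins after the stop codon
        pos += 3
    if pos - run_start > best[1] - best[0]:
        best = (run_start, pos)
    return best


if __name__ == "__main__":
    example = "ATGGCCTAAGGGCCCAAATTT"
    start, end = longest_open_run(example)
    print(is_open(example), example[start:end])
```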

The terms "operably linked" or "operably associated" refer to functionally related 
nucleic acid sequences. For example, a promoter is operably associated or operably 
linked with a coding sequence if the promoter controls the transcription of the coding 
sequence. While operably linked nucleic acid sequences can be contiguous and in the 
same reading frame, certain genetic elements e.g. repressor genes are not contiguously
linked to the coding sequence but still control transcription/translation of the coding 
sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a 
number of differentiated cell types that are present in an adult organism. A pluripotent 
cell is restricted in its differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to
naturally occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7 amino acids, more preferably at least about 9 amino acids and most
preferably at least about 17 or more amino acids. The peptide preferably is not greater
than about 200 amino acids, more preferably less than 150 amino acids and most
preferably less than 100 amino acids. Preferably the peptide is from about 5 to about 200
amino acids. To be active, any polypeptide must have sufficient length to display
biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by 
cells that have not been genetically engineered and specifically contemplates various 
polypeptides arising from post-translational modifications of the polypeptide including, 
but not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation
and acylation.

The term "translated protein coding portion" means a sequence which encodes for
the full length protein which may include any leader sequence or any processing
sequence. 

The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein portion"
means that portion of the protein which does not include a signal or leader sequence. The
peptide may have been produced by processing in the cell which removes any
leader/signal sequence. The mature protein portion may or may not include the initial
methionine residue. The methionine residue may be removed from the protein during
processing in the cell. The peptide may be produced synthetically or the protein may
have been produced using a polynucleotide only encoding for the mature protein coding
sequence.

The term "derivative" refers to polypeptides chemically modified by such 
techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes), 
covalent polymer attachment such as pegylation (derivatization with polyethylene glycol)
and insertion or substitution by chemical synthesis of amino acids such as ornithine,
which do not normally occur in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created
using, e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues may be replaced, added or deleted without abolishing activities of interest, may
be found by comparing the sequence of the particular polypeptide with that of
homologous peptides and minimizing the number of amino acid sequence changes made
in regions of high homology (conserved regions) or by replacing amino acids with
consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides 
may be synthesized or selected by making use of the "redundancy" in the genetic code. 
Various codon substitutions, such as the silent changes which produce various restriction 
sites, may be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide
sequence may be reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify the properties of any part of the polypeptide, to change
characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid
with another amino acid having similar structural and/or chemical properties, i.e.,
conservative amino acid replacements. "Conservative" amino acid substitutions may be
made on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues involved. For example,
nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine,
serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic)
amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino
acids include aspartic acid and glutamic acid. "Insertions" or "deletions" are preferably
in the range of about 1 to 20 amino acids, more preferably 1 to 10 amino acids. The
variation allowed may be experimentally determined by systematically making insertions,
deletions, or substitutions of amino acids in a polypeptide molecule using recombinant
DNA techniques and assaying the resulting recombinant variants for activity.
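
The four groupings listed above lend themselves to a simple lookup. The following Python sketch, given only as an example, encodes those groupings with standard one-letter amino acid codes and treats a substitution as "conservative" when both residues fall in the same group. The names are hypothetical, and the same-group test is a simplifying assumption: the text also contemplates similarity in solubility and amphipathic nature, which this sketch does not model.

```python
# Groupings as listed in the text, written with one-letter amino acid codes.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Name of the group containing the given one-letter residue code."""
    code = residue.upper()
    for name, members in AMINO_ACID_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"not one of the twenty listed amino acids: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same group (simplified criterion)."""
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # leucine -> isoleucine, both nonpolar
    print(is_conservative("K", "E"))  # lysine -> glutamic acid, basic vs. acidic
```

Such a lookup can only flag candidate substitutions; as stated above, the variation actually tolerated is determined experimentally by assaying the resulting variants for activity.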

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such 
alterations can, for example, alter one or more of the biological functions or biochemical 
characteristics of the polypeptides of the invention. For example, such alterations may
change polypeptide characteristics such as ligand-binding affinities, interchain affinities,
or degradation/turnover rate. Further, such alterations can be selected so as to generate
polypeptides that are better suited for expression, scale up and the like in the host cells
chosen for expression. For example, cysteine residues can be deleted or substituted with 
another amino acid residue in order to eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other
biological macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight, more preferably at least 99% by weight, of the indicated biological
macromolecules present (but water, buffers, and other small molecules, especially
molecules having a molecular weight of less than 1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide 
separated from at least one other component (e.g., nucleic acid or polypeptide) present 
with the nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic 
acid or polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or 
other component normally present in a solution of the same. The terms "isolated" and 
"purified" do not encompass nucleic acids or polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, 
means that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, 
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or 
proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product, 
"recombinant microbial" defines a polypeptide or protein essentially free of native 
endogenous substances and unaccompanied by associated native glycosylation. 
Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of 
glycosylation modifications; polypeptides or proteins expressed in yeast will have a 
glycosylation pattern in general different from those expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage 
or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An 
expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a 
genetic element or elements having a regulatory role in gene expression, for example, 
promoters or enhancers, (2) a structural or coding sequence which is transcribed into 
mRNA and translated into protein, and (3) appropriate transcription initiation and 
termination sequences. Structural units intended for use in yeast or eukaryotic expression 
systems preferably include a leader sequence enabling extracellular secretion of 
translated protein by a host cell. Alternatively, where recombinant protein is expressed 
without a leader or transport sequence, it may include an amino terminal methionine 
residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the 
recombinant transcriptional unit extrachromosomally. Recombinant expression systems 
as defined herein will express heterologous polypeptides or proteins upon induction of 
the regulatory elements linked to the DNA segment or synthetic gene to be expressed. 
This term also means host cells which have stably integrated a recombinant genetic 
element or elements having a regulatory role in gene expression, for example, promoters 
or enhancers. Recombinant expression systems as defined herein will express 
polypeptides or proteins endogenous to the cell upon induction of the regulatory elements 
linked to the endogenous DNA segment or gene to be expressed. The cells can be 
prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a 
membrane, including transport as a result of signal sequences in its amino acid sequence 
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation 
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell 
in which they are expressed. "Secreted" proteins also include without limitation proteins 
that are transported across the membrane of the endoplasmic reticulum. "Secreted" 
proteins are also intended to include proteins containing non-typical signal sequences 
(e.g., Interleukin-1 Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) 
and factors released from damaged cells (e.g., Interleukin-1 Receptor Antagonist, 
see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 16:27-55). 

Where desired, an expression vector may be designed to contain a "signal or 
leader sequence" which will direct the polypeptide through the membrane of a cell. Such 
a sequence may be naturally present on the polypeptides of the present invention or 
provided from heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood 
in the art as stringent. Stringent conditions can include highly stringent conditions (i.e., 
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 
1 mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately 
stringent conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary 
hybridization conditions are described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary 
stringent hybridization conditions include washing in 6X SSC/0.05% sodium 
pyrophosphate at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligos), 55°C 
(for 20-base oligonucleotides), and 60°C (for 23-base oligonucleotides). 

As used herein, "substantially equivalent" or "substantially similar" can refer both 
to nucleotide and amino acid sequences, for example a mutant sequence, that varies from 
a reference sequence by one or more substitutions, deletions, or additions, the net effect 
of which does not result in an adverse functional dissimilarity between the reference and 
subject sequences. Typically, such a substantially equivalent sequence varies from one of 
those listed herein by no more than about 35% (i.e., the number of individual residue 
substitutions, additions, and/or deletions in a substantially equivalent sequence, as 
compared to the corresponding reference sequence, divided by the total number of 
residues in the substantially equivalent sequence is about 0.35 or less). Such a sequence 
is said to have 65% sequence identity to the listed sequence. In one embodiment, a 
substantially equivalent, e.g., mutant, sequence of the invention varies from a listed 
sequence by no more than 30% (70% sequence identity); in a variation of this 
embodiment, by no more than 25% (75% sequence identity); and in a further variation of 
this embodiment, by no more than 20% (80% sequence identity) and in a further variation 
of this embodiment, by no more than 10% (90% sequence identity) and in a further 
variation of this embodiment, by no more than 5% (95% sequence identity). Substantially 
equivalent, e.g., mutant, amino acid sequences according to the invention preferably have 
at least 80% sequence identity with a listed amino acid sequence, more preferably at least 
85% sequence identity, more preferably at least 90% sequence identity, more preferably 
at least 95% sequence identity, more preferably at least 98% sequence identity, and most 
preferably at least 99% sequence identity. Substantially equivalent nucleotide sequences 
of the invention can have lower percent sequence identities, taking into account, for 
example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide 
sequence has at least about 65% identity, more preferably at least about 75% identity, 
more preferably at least about 80% sequence identity, more preferably at least 85% 
sequence identity, more preferably at least 90% sequence identity, more preferably at 
least about 95% sequence identity, more preferably at least 98% sequence identity, and 
most preferably at least 99% sequence identity. For the purposes of the present 
invention, sequences having substantially equivalent biological activity and substantially 
equivalent expression characteristics are considered substantially equivalent. For the 
purposes of determining equivalence, truncation of the mature sequence (e.g., via a 
mutation which creates a spurious stop codon) should be disregarded. Sequence identity 
may be determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods 
Enzymol. 183:626-645). Identity between sequences can also be determined by other 
methods known in the art, e.g., by varying hybridization conditions. 
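
The arithmetic behind the percent-identity figures above can be illustrated with a short sketch. It assumes the two sequences have already been aligned to equal length, with gaps written as "-"; the function name `percent_identity` and that input convention are illustrative assumptions and are not part of the specification. It does not implement the Jotun Hein alignment method itself.

```python
def percent_identity(reference: str, subject: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Each aligned column in which the two characters differ (a substitution,
    or a gap '-' marking an addition or deletion) counts as one change.
    Following the definition in the text, the number of changes is divided
    by the number of residues in the subject (substantially equivalent)
    sequence, gaps excluded.
    """
    if len(reference) != len(subject):
        raise ValueError("sequences must be aligned to the same length")
    changes = sum(1 for r, s in zip(reference, subject) if r != s)
    subject_residues = sum(1 for s in subject if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return 100.0 * (subject_residues - changes) / subject_residues


# Hypothetical 20-residue example: 7 changes out of 20 residues gives a
# variation of 0.35, i.e. 65% sequence identity.
reference = "ACGTACGTACGTACGTACGT"
subject = "TGCATGCTACGTACGTACGT"
print(percent_identity(reference, subject))
```

Under this convention a subject sequence with 7 changes in 20 residues sits exactly at the 35% variation (65% identity) boundary described above.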

The term "totipotent" refers to the capability of a cell to differentiate into all of 
the cell types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so 
that the DNA is replicable, either as an extrachromosomal element, or by chromosomal 
integration. The term "transfection" refers to the taking up of an expression vector by a 
suitable host cell, whether or not any coding sequences are in fact expressed. The term 
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a 
virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of 
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can 
be readily identified using known UMFs as a target sequence or target motif with the 
computer-based systems described below. The presence and activity of a UMF can be 
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic 
acid molecule is then incubated with an appropriate host under appropriate conditions and 
the uptake of the marker sequence is determined. As described above, a UMF will 
increase the frequency of uptake of a linked marker sequence. 

Each of the above terms is meant to encompass all that is described for each, 
unless the context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising 
the nucleotide sequences of SEQ ID NO: 1 - 438; a polynucleotide encoding any one of 
the peptide sequences of SEQ ID NO: 1 - 438; and a polynucleotide comprising the 
nucleotide sequence encoding the mature protein coding sequence of the polynucleotides 
of any one of SEQ ID NO: 1 - 438. The polynucleotides of the present invention also 
include, but are not limited to, a polynucleotide that hybridizes under stringent conditions 
to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1 - 438; (b) 
nucleotide sequences encoding any one of the amino acid sequences set forth in the 
Sequence listing; (c) a polynucleotide which is an allelic variant of any polynucleotide 
recited above; (d) a polynucleotide which encodes a species homolog of any of the 
proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of the polypeptides of SEQ ID NO: 1 - 438. Domains of 
interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the 
variable immunoglobulin-like domains; domains in enzyme-like polypeptides include 
catalytic and substrate binding domains; and domains in ligand polypeptides include 
receptor-binding domains. 

The polynucleotides of the invention include naturally occurring or wholly or 
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The 
polynucleotides may include all of the coding region of the cDNA or may represent a 
portion of the coding region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences 
disclosed herein. The corresponding genes can be isolated in accordance with known 
methods using the sequence information disclosed herein. Such methods include the 
preparation of probes or primers from the disclosed sequence information for identification 
and/or amplification of genes in appropriate genomic libraries or other sources of genomic 
materials. Further 5' and 3' sequence can be obtained using methods known in the art. For 
example, full length cDNA or genomic DNA that corresponds to any of the polynucleotides 
of SEQ ID NO: 1 - 438 can be obtained by screening appropriate cDNA or genomic DNA 
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID 
NO: 1 - 438 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID 
NO: 1 - 438 may be used as the basis for suitable primer(s) that allow identification and/or 
amplification of genes in appropriate genomic DNA or cDNA libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and 
sequences (including cDNA and genomic sequences) obtained from one or more public 
databases, such as dbEST, gbpri, and UniGene. The EST sequences can provide identifying 
sequence information, representative fragment or segment information, or novel segment 
information for the full-length gene. 

The polynucleotides of the invention also provide polynucleotides including 
nucleotide sequences that are substantially equivalent to the polynucleotides recited 
above. Polynucleotides according to the invention can have, e.g., at least about 65%, at 
least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more 
typically at least about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 
91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99% 
sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are 
nucleic acid sequence fragments that hybridize under stringent conditions to any of the 
nucleotide sequences of SEQ ID NO: 1 - 438, or complements thereof, which fragment is 
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the 
polynucleotides of the invention are contemplated. Probes capable of specifically 
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the 
invention from other polynucleotide sequences in the same family of genes or can 
differentiate human genes from genes of other species, and are preferably based on 
unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to 
these specific sequences, but also include allelic and species variations thereof. Allelic and 
species variations can be routinely determined by comparing the sequence provided in SEQ 
ID NO: 1 - 438, a representative fragment thereof, or a nucleotide sequence at least 90% 
identical, preferably 95% identical, to SEQ ID NOs: 1 - 438 with a sequence from another 
isolate of the same species. Furthermore, to accommodate codon variability, the invention 
includes nucleic acid molecules coding for the same amino acid sequences as do the specific 
ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one 
codon for another codon that encodes the same amino acid is expressly contemplated. 
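
The codon-substitution point above can be sketched as a check that two ORFs encode the same amino acid sequence. This is a minimal illustration that assumes the standard genetic code, DNA-alphabet ORFs whose length is a multiple of three, and hypothetical helper names (`translate`, `encode_same_protein`); none of these names come from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF into one-letter amino acids ('*' = stop)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True if the two ORFs differ, if at all, only by synonymous codons."""
    return translate(orf_a) == translate(orf_b)


# GCT and GCC both encode alanine; TAA and TAG are both stop codons.
print(encode_same_protein("ATGGCTTAA", "ATGGCCTAG"))
```

Two ORFs that pass this check differ only by synonymous codons, which is the kind of variation the paragraph above expressly contemplates.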

The nearest neighbor or homology result for the nucleic acids of the present 
invention, including SEQ ID NOs: 1 - 438, can be obtained by searching a database using an 
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment 
Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 
36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, 
a FASTA version 3 search against Genpept, using the Fastxy algorithm, may be used. 

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are 
also provided by the present invention. Species homologs may be isolated and identified 
by making suitable probes or primers from the sequences provided herein and screening a 
suitable nucleic acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides 
or proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide 
which also encode proteins which are identical, homologous or related to that encoded by 
the polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences 
which encode variants of the described nucleic acids. These amino acid sequence 
variants may be prepared by methods known in the art by introducing appropriate 
nucleotide changes into a native or variant polynucleotide. There are two variables in the 
construction of amino acid sequence variants: the location of the mutation and the nature 
of the mutation. Nucleic acids encoding the amino acid sequence variants are preferably 
constructed by mutating the polynucleotide to encode an amino acid sequence that does 
not occur in nature. These nucleic acid alterations can be made at sites that differ in the 
nucleic acids from different species (variable positions) or in highly conserved regions 
(constant regions). Sites at such locations will typically be modified in series, e.g., by 
substituting first with conservative choices (e.g., hydrophobic amino acid to a different 
hydrophobic amino acid) and then with more distant choices (e.g., hydrophobic amino 
acid to a charged amino acid), and then deletions or insertions may be made at the target 
site. Amino acid sequence deletions generally range from about 1 to 30 residues, 
preferably about 1 to 10 residues, and are typically contiguous. Amino acid insertions 
include amino- and/or carboxyl-terminal fusions ranging in length from one to one 
hundred or more residues, as well as intrasequence insertions of single or multiple amino 
acid residues. Intrasequence insertions may range generally from about 1 to 10 amino 
acid residues, preferably from 1 to 5 residues. Examples of terminal insertions include the 
heterologous signal sequences necessary for secretion or for intracellular targeting in 
different host cells and sequences such as FLAG or poly-histidine sequences useful for 
purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences 
are changed via site-directed mutagenesis. This method uses oligonucleotide sequences 
to alter a polynucleotide to encode the desired amino acid variant, as well as sufficient 
adjacent nucleotides on both sides of the changed amino acid to form a stable duplex on 
either side of the site being changed. In general, the techniques of site-directed 
mutagenesis are well known to those of skill in the art and this technique is exemplified 
by publications such as Edelman et al., DNA 2:183 (1983). A versatile and efficient 
method for producing site-specific changes in a polynucleotide sequence was published 
by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to 
create amino acid sequence variants of the novel nucleic acids. When small amounts of 
template DNA are used as starting material, primer(s) that differs slightly in sequence 
from the corresponding region in the template DNA can generate the desired amino acid 
variant. PCR amplification results in a population of product DNA fragments that differ 
from the polynucleotide template encoding the polypeptide at the position specified by 
the primer. The product DNA fragments replace the corresponding region in the plasmid 
and this gives a polynucleotide encoding the desired amino acid variant. 
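
The idea of a mutagenic oligonucleotide that carries the changed codon together with matching template sequence on both sides can be sketched as simple string manipulation. The function name `mutagenic_oligo`, the 0-based codon indexing, and the default flank of 12 nucleotides are illustrative assumptions; the text does not specify a flank length, and the sketch makes no claim about duplex stability.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Build an oligonucleotide that replaces one codon of an in-frame template.

    codon_index is 0-based and counted in codons from the start of `template`.
    The returned oligo is `flank` unchanged template bases, the new codon,
    then `flank` more unchanged template bases.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start < flank or end + flank > len(template):
        raise ValueError("not enough flanking sequence on both sides")
    return template[start - flank:start] + new_codon + template[end:end + flank]


# Hypothetical template; change the fourth codon (TTT) to TGT with 6-base flanks.
template = "AAACCCGGGTTTAAACCCGGG"
print(mutagenic_oligo(template, codon_index=3, new_codon="TGT", flank=6))
```

In practice the flank length would be chosen so that the oligonucleotide anneals stably on either side of the mismatch, which this sketch leaves to the user.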

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis 
techniques well known in the art, such as, for example, the techniques in Sambrook et al., 
supra, and Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent 
degeneracy of the genetic code, other DNA sequences which encode substantially the 
same or a functionally equivalent amino acid sequence may be used in the practice of the 
invention for the cloning and expression of these novel nucleic acids. Such DNA 
sequences include those which are capable of hybridizing to the appropriate novel nucleic 
acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can 
be used to generate polynucleotides encoding chimeric or fusion proteins comprising one 
or more domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any 
of the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, 
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such 
polynucleotides are well known to those of skill in the art and can include, for example, 
methods for determining hybridization conditions that can routinely isolate 
polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the 
mature protein coding sequences corresponding to any one of SEQ ID NO: 1 - 438, or 
functional equivalents thereof, may be used to generate recombinant DNA molecules that 
direct the expression of that nucleic acid, or a functional equivalent thereof, in 
appropriate host cells. Also included are the cDNA inserts of any of the clones identified 
herein. 

A polynucleotide according to the invention can be joined to any of a variety of 
other nucleotide sequences by well-established recombinant DNA techniques (see 
Sambrook J et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory, NY). Useful nucleotide sequences for joining to polynucleotides include an 
assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and 
the like, that are well known in the art. Accordingly, the invention also provides a vector 
including a polynucleotide of the invention and a host cell containing the polynucleotide. 
In general, the vector contains an origin of replication functional in at least one organism, 
convenient restriction endonuclease sites, and a selectable marker for the host cell. 
Vectors according to the invention include expression vectors, replication vectors, probe 
generation vectors, and sequencing vectors. A host cell according to the invention can be 
a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a 
multicellular organism. 

The present invention further provides recombinant constructs comprising a 
nucleic acid having any of the nucleotide sequences of SEQ ID NOs: 1 - 438 or a 
fragment thereof or any other polynucleotides of the invention. In one embodiment, the 
recombinant constructs of the present invention comprise a vector, such as a plasmid or 
viral vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID 
NOs: 1 - 438 or a fragment thereof is inserted, in a forward or reverse orientation. In the 
case of a vector comprising one of the ORFs of the present invention, the vector may 
further comprise regulatory sequences, including for example, a promoter, operably 
linked to the ORF. Large numbers of suitable vectors and promoters are known to those 
of skill in the art and are commercially available for generating the recombinant 
constructs of the present invention. The following vectors are provided by way of 
example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, 
pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, 
pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene); 
pSVK3, pBPV, pMSG, pSVL (Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an 
expression control sequence such as the pMT2 or pED expression vectors disclosed in 
Kaufman et al., Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein 
recombinantly. Many suitable expression control sequences are known in the art. 
General methods of expressing recombinant proteins are also known and are exemplified 
in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein 
"operably linked" means that the isolated polynucleotide of the invention and an 
expression control sequence are situated within a vector or cell in such a way that the 
protein is expressed by a host cell which has been transformed (transfected) with the 
ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters 
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV 
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and 
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within 
the level of ordinary skill in the art. Generally, recombinant expression vectors will 
include origins of replication and selectable markers permitting transformation of the host 
cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a 
promoter derived from a highly-expressed gene to direct transcription of a downstream 
structural sequence. Such promoters can be derived from operons encoding glycolytic 
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat 
shock proteins, among others. The heterologous structural sequence is assembled in 
appropriate phase with translation initiation and termination sequences, and preferably, a 
leader sequence capable of directing secretion of translated protein into the periplasmic 
space or extracellular medium. Optionally, the heterologous sequence can encode a 
fusion protein including an amino terminal identification peptide imparting desired 
characteristics, e.g., stabilization or simplified purification of expressed recombinant 
product. Useful expression vectors for bacterial use are constructed by inserting a 
structural DNA sequence encoding a desired protein together with suitable translation 
initiation and termination signals in operable reading phase with a functional promoter. 
The vector will comprise one or more phenotypic selectable markers and an origin of 
replication to ensure maintenance of the vector and to, if desirable, provide amplification 
within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus 
subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, 
Streptomyces, and Staphylococcus, although others may also be employed as a matter of 
choice. 

As a representative but non-limiting example, useful expression vectors for 
bacterial use can comprise a selectable marker and bacterial origin of replication derived 
from commercially available plasmids comprising genetic elements of the well known 
cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, 
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega 
Biotech, Madison, WI, USA). These pBR322 "backbone" sections are combined with an 
appropriate promoter and the structural sequence to be expressed. Following 
transformation of a suitable host strain and growth of the host strain to an appropriate cell 
density, the selected promoter is induced or derepressed by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 
Cells are typically harvested by centrifugation, disrupted by physical or chemical means, 
and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. 
For example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated 
herein by reference, nucleic acid sequences encoding a polypeptide may be used to 
generate antibodies against the encoded polypeptide following topical administration of 
naked plasmid DNA or following injection, and preferably intra-muscular injection of the 
DNA. The nucleic acid sequences are preferably inserted in a recombinant expression 
vector and may be in the form of naked DNA. 

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid 
molecules that are hybridizable to or complementary to the nucleic acid molecule 
comprising the nucleotide sequence of SEQ ID NO: 1 - 438, or fragments, analogs or 
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is 
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the 
coding strand of a double-stranded cDNA molecule or complementary to an mRNA 
sequence. In specific aspects, antisense nucleic acid molecules are provided that 
comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500 
nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid 
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of 
SEQ ID NO: 1 - 438 or antisense nucleic acids complementary to a nucleic acid sequence 
of SEQ ID NO: 1 - 438 are additionally provided. 

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding 
region" of the coding strand of a nucleotide sequence of the invention. The term "coding 
region" refers to the region of the nucleotide sequence comprising codons which are 
translated into amino acid residues. In another embodiment, the antisense nucleic acid 
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide 
sequence of the invention. The term "noncoding region" refers to 5' and 3' sequences that 
flank the coding region that are not translated into amino acids (i.e., also referred to as 5' 
and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., 
SEQ ID NO: 1 - 438), antisense nucleic acids of the invention can be designed according 
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid 
molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or 
noncoding region of an mRNA. For example, the antisense oligonucleotide can be 
complementary to the region surrounding the translation start site of an mRNA. An 
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 
50 nucleotides in length. An antisense nucleic acid of the invention can be constructed 
using chemical synthesis or enzymatic ligation reactions using procedures known in the 
art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be 
chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase 
the physical stability of the duplex formed between the antisense and sense nucleic acids, 
e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
Examples of modified nucleotides that can be used to generate the antisense
nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically
using an expression vector into which a nucleic acid has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense
orientation to a target nucleic acid of interest, described further in the following
subsection).
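
The Watson-Crick design rule described above can be illustrated with a short sketch. The
function names, the example sequence and the choice to center the window on the start
codon are assumptions made for illustration only; they are not part of the disclosure,
and a practical design would also weigh secondary structure, specificity and the
nucleotide modifications listed above.

```python
# Illustrative sketch: derive an antisense oligonucleotide that is complementary
# to a window of the coding strand around the translation start site.
# All names and the example sequence are hypothetical.

# Watson-Crick partners; U is accepted on input so an mRNA sequence also works.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement (as DNA) of a DNA or RNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_around_start(coding_strand: str, start_codon_index: int, length: int = 20) -> str:
    """Return an antisense oligo, written 5'->3', covering `length` nucleotides
    of the coding strand, beginning roughly length/2 nucleotides upstream of the
    start codon (clipped at the 5' end of the sequence)."""
    if length <= 0:
        raise ValueError("length must be positive")
    begin = max(0, start_codon_index - length // 2)
    window = coding_strand[begin:begin + length]
    return reverse_complement(window)


if __name__ == "__main__":
    # Hypothetical coding strand with the ATG start codon at index 5.
    print(antisense_around_start("CCCCCATGGGGG", 5, 6))
```

The `length` argument corresponds to the oligonucleotide lengths mentioned above (about 5
to about 50 nucleotides).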

The antisense nucleic acid molecules of the invention are typically administered
to a subject or generated in situ such that they hybridize with or bind to cellular mRNA
and/or genomic DNA encoding a protein according to the invention to thereby inhibit
expression of the protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by conventional nucleotide complementarity to form a stable duplex,
or, for example, in the case of an antisense nucleic acid molecule that binds to DNA
duplexes, through specific interactions in the major groove of the double helix. An
example of a route of administration of antisense nucleic acid molecules of the invention
includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules
can be modified to target selected cells and then administered systemically. For example,
for systemic administration, antisense molecules can be modified such that they
specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by
linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered
to cells using the vectors described herein. To achieve sufficient intracellular
concentrations of antisense molecules, vector constructs in which the antisense nucleic
acid molecule is placed under the control of a strong pol II or pol III promoter are
preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is
an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have
a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having
specificity for a nucleic acid of the invention can be designed based upon the nucleotide
sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1 - 438). For example, a
derivative of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the active site is complementary to the nucleotide sequence to be cleaved in a
SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al.
U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic
RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g.,
Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally,
Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci. 660: 27-36; and Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see
Hyrup et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the
deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the
four natural nucleobases are retained. The neutral backbone of PNAs has been shown to
allow for specific hybridization to DNA and RNA under conditions of low ionic strength.
The synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996)
PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific
modulation of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of
single base pair mutations in a gene by, e.g., PNA directed PCR clamping; as artificial
restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases
(Hyrup B. (1996) above); or as probes or primers for DNA sequence and hybridization
(Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of
drug delivery known in the art. For example, PNA-DNA chimeras can be generated that
may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms
of base stacking, number of bonds between the nucleobases, and orientation (Hyrup
(1996) above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain can be synthesized on a solid support using standard phosphoramidite
coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the
PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA
monomers are then coupled in a stepwise manner to produce a chimeric molecule with a
5' PNA segment and a 3' DNA segment (Finn et al. (1996) above). Alternatively,
chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA segment.
See, Petersen et al. (1995) Bioorg Med Chem Lett 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups
such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating
transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci.
U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered
cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents (see, e.g., Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide
may be conjugated to another molecule, e.g., a peptide, a hybridization-triggered
cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, etc.

4.5 HOSTS

The present invention further provides host cells genetically engineered to contain
the polynucleotides of the invention. For example, such host cells may contain nucleic
acids of the invention introduced into the host cell using known transformation,
transfection or infection methods. The present invention still further provides host cells
genetically engineered to express the polynucleotides of the invention, wherein such
polynucleotides are in operative association with a regulatory sequence heterologous to
the host cell which drives expression of the polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit,
or increase, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the polypeptide at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the encoding
sequences. See, for example, PCT International Publication No. WO94/12650, PCT
International Publication No. WO92/20808, and PCT International Publication No.
WO91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase)
and/or intron DNA may be inserted along with the heterologous promoter DNA. If
linked to the coding sequence, amplification of the marker DNA by standard selection
methods results in co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a
lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell,
such as a bacterial cell. Introduction of the recombinant construct into the host cell can
be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one of the polynucleotides of the invention can be used in conventional
manners to produce the gene product encoded by the isolated fragment (in the case of an
ORF) or can be used to produce a heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa
cells, CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E.
coli and B. subtilis. The most preferred cells are those which do not normally express the
particular polypeptide or protein or which express the polypeptide or protein at a low
natural level. Mature proteins can be expressed in mammalian cells, yeast, bacteria, or
other cells under the control of appropriate promoters. Cell-free translation systems can
also be employed to produce such proteins using RNAs derived from the DNA constructs
of the present invention. Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by Sambrook, et al., in Molecular
Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New York (1989),
the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7
lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines capable of expressing a compatible vector are, for example, the C127, monkey
COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human
epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed
primate cell lines, normal diploid cells, cell strains derived from in vitro culture of
primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or
Jurkat cells. Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and also any necessary ribosome binding sites, polyadenylation site,
splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be
used to provide the required nontranscribed genetic elements. Recombinant polypeptides
and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion
chromatography steps. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps. Microbial cells employed in
expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such
as yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains
include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains,
Candida, or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella
typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the
protein is made in yeast or bacteria, it may be necessary to modify the protein produced
therein, for example by phosphorylation or glycosylation of the appropriate sites, in order
to obtain the functional protein. Such covalent attachments may be accomplished using
known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be
engineered to express an endogenous gene comprising the polynucleotides of the
invention under the control of inducible regulatory elements, in which case the regulatory
sequences of the endogenous gene may be replaced by homologous recombination. As
described herein, gene targeting can be used to replace a gene's existing regulatory region
with a regulatory sequence isolated from a different gene or a novel regulatory sequence
synthesized by genetic engineering methods. Such regulatory sequences may be
comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory
elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the
RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements,
splice sites, leader sequences for enhancing or modifying transport or secretion properties
of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing
the gene under the control of the new regulatory sequence, e.g., inserting a new promoter
or enhancer or both upstream of a gene. Alternatively, the targeting event may be a
simple deletion of a regulatory element, such as the deletion of a tissue-specific negative
regulatory element. Alternatively, the targeting event may replace an existing element;
for example, a tissue-specific enhancer can be replaced by an enhancer that has broader
or different cell-type specificity than the naturally occurring elements. Here, the
naturally occurring sequences are deleted and new sequences are added. In all cases, the
identification of the targeting event may be facilitated by the use of one or more
selectable marker genes that are contiguous with the targeting DNA, allowing for the
selection of cells in which the exogenous DNA has integrated into the host cell genome.
The identification of the targeting event may also be facilitated by the use of one or more
marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct
homologous recombination event with sequences in the host cell genome does not result
in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in
accordance with this aspect of the invention are more particularly described in U.S. Patent
No. 5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International
Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by reference herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO:
1-438 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NOs: 1-438 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity
that are encoded by: (a) a polynucleotide having any one of the nucleotide sequences set
forth in SEQ ID NOs: 1-438 or (b) polynucleotides encoding any one of the amino acid
sequences set forth as SEQ ID NO: 1-438 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization
conditions. The invention also provides biologically active or immunologically active
variants of any of the amino acid sequences set forth as SEQ ID NO: 1-438 or the
corresponding full length or mature protein; and "substantial equivalents" thereof (e.g.,
with at least about 65%, at least about 70%, at least about 75%, at least about 80%, at
least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%,
typically at least about 95%, 96%, 97%, more typically at least about 98%, or most
typically at least about 99% amino acid identity) that retain biological activity.
Polypeptides encoded by allelic variants may have a similar, increased, or decreased
activity compared to polypeptides comprising SEQ ID NO: 1-438.

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the
protein may be in linear form or they may be cyclized using known methods, for
example, as described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and
in R. S. McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such
as immunoglobulins for many purposes, including increasing the valency of protein
binding sites.

The present invention also provides both full-length and mature forms (for 
example, without a signal sequence or precursor sequence) of the disclosed proteins. The 
protein coding sequence is identified in the sequence listing by translation of the 
disclosed nucleotide sequences. The mature form of such protein may be obtained by 
expression of a full-length polynucleotide in a suitable mammalian cell or other host cell. 
The sequence of the mature form of the protein is also determinable from the amino acid 
sequence of the full-length form. Where proteins of the present invention are membrane 
bound, soluble forms of the proteins are also provided. In such forms, part or all of the 
regions causing the proteins to be membrane bound are deleted so that the proteins are
fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable 
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the 
nucleic acid fragments of the present invention or by degenerate variants of the nucleic 
acid fragments of the present invention. By "degenerate variant" is intended nucleotide 
fragments which differ from a nucleic acid fragment of the present invention (e.g., an 
ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an 
identical polypeptide sequence. Preferred nucleic acid fragments of the present invention
are the ORFs that encode proteins.
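
The definition of a "degenerate variant" given above can be checked mechanically. The
following is a minimal sketch, assuming the standard genetic code and two in-frame open
reading frames of equal length; the function names and the example sequences are
hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch: two ORFs are degenerate variants of one another when
# their nucleotide sequences differ but their translations are identical.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF codon by codon using the standard code."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True when the nucleotide sequences differ but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # GCT/GCC both encode alanine and AAA/AAG both encode lysine.
    print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
    print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```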

A variety of methodologies known in the art can be utilized to obtain any one of
the isolated polypeptides or proteins of the present invention. At the simplest level, the
amino acid sequence can be synthesized using commercially available peptide
synthesizers. The synthetically-constructed protein sequences, by virtue of sharing
primary, secondary or tertiary structural and/or conformational characteristics with
proteins may possess biological properties in common therewith, including protein
activity. This technique is particularly useful in producing small peptides and fragments
of larger polypeptides. Fragments are useful, for example, in generating antibodies
against the native polypeptide. Thus, they may be employed as biologically active or
immunological substitutes for natural, purified proteins in screening of therapeutic
compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be 
purified from cells which have been altered to express the desired polypeptide or protein. 
As used herein, a cell is said to be altered to express a desired polypeptide or protein 
when the cell, through genetic manipulation, is made to produce a polypeptide or protein 
which it normally does not produce or which the cell normally produces at a lower level.
One skilled in the art can readily adapt procedures for introducing and expressing either 
recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to 
generate a cell which produces one of the polypeptides or proteins of the present 
invention. 

The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium, and
purifying the protein from the cells or the culture in which the cells are grown. For
example, the methods of the invention include a process for producing a polypeptide in
which a host cell containing a suitable expression vector that includes a polynucleotide of
the invention is cultured under conditions that allow expression of the encoded
polypeptide. The polypeptide can be recovered from the culture, conveniently from the
culture medium, or from a lysate prepared from the host cells and further purified.
Preferred embodiments include those in which the protein produced by such process is a
full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial
cells which naturally produce the polypeptide or protein. One skilled in the art can
readily follow known methods for isolating polypeptides and proteins in order to obtain
one of the isolated polypeptides or proteins of the present invention. These include, but
are not limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes,
Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al.,
in Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular Biology. Polypeptide fragments that retain biological/immunological activity
include fragments comprising greater than about 100 amino acids, or greater than about
200 amino acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well
known in the art to identify molecules which bind to the polypeptides. These molecules
include, but are not limited to, e.g., small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are
then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the 
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds 
that are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or
other cell by the specificity of the binding molecule for SEQ ID NO: 1-438. 

The protein of the invention may also be expressed as a product of transgenic 
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which
are characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein.

The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be made by those skilled in the art using known techniques. Modifications
of interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For
example, one or more of the cysteine residues may be deleted or replaced with another
amino acid to alter the conformation of the molecule. Techniques for such alteration,
substitution, replacement, insertion or deletion are well known to those skilled in the art
(see, e.g., U.S. Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or deletion retains the desired activity of the protein. Regions of the protein that
are important for the protein function can be determined by various methods known in
the art including the alanine-scanning method, which involves systematic substitution of
single or strings of amino acids with alanine, followed by testing the resulting
alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein
that are important for protein function may be determined by the eMATRIX program.
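
The alanine-scanning method described above begins by enumerating the variants to be
tested. The following minimal sketch generates the single-residue alanine variants of a
peptide; the function name and the example peptide are hypothetical, substitution of
strings of residues is not covered, and the activity assay itself is a laboratory step that
is not modeled here.

```python
# Illustrative sketch: list single-residue alanine-scanning variants of a peptide.
from typing import Dict


def alanine_scan(protein: str) -> Dict[int, str]:
    """Map each 1-based position to the variant in which that residue is
    replaced by alanine. Positions that are already alanine are skipped
    because the substitution would not change the sequence."""
    variants: Dict[int, str] = {}
    for index, residue in enumerate(protein):
        if residue == "A":
            continue
        variants[index + 1] = protein[:index] + "A" + protein[index + 1:]
    return variants


if __name__ == "__main__":
    # Hypothetical four-residue peptide.
    for position, variant in alanine_scan("MKAC").items():
        print(position, variant)
```

Each variant would then be expressed and assayed; a loss of activity for a given variant
suggests that the substituted residue is important for function.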

Other fragments and derivatives of the sequences of proteins which would be 
expected to retain protein activity in whole or in part and are useful for screening or other 
immunological methodologies may also be easily made by those skilled in the art given 
the disclosures herein. Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide 
of the invention to suitable control sequences in one or more insect expression vectors, 
and employing an insect expression system. Materials and methods for
baculovirus/insect cell expression systems are commercially available in kit form from,
e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac™ kit), and such methods are well
known in the art, as described in Summers and Smith, Texas Agricultural Experiment 
Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an 
insect cell capable of expressing a polynucleotide of the present invention is 
"transformed." 

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or
cell extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such
affinity resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA
Sepharose™; one or more steps involving hydrophobic interaction chromatography using
such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity
chromatography.

Alternatively, the protein of the invention may also be expressed in a form which
will facilitate purification. For example, it may be expressed as a fusion protein, such as
those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as a His tag. Kits for expression and purification of such fusion proteins are
commercially available from New England BioLab (Beverly, Mass.), Pharmacia
(Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an
epitope and subsequently purified by using a specific antibody directed to such epitope.
One such epitope ("FLAG®") is commercially available from Kodak (New Haven,
Conn.).

Finally, one or more reverse-phase high performance liquid chromatography
(RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or
all of the foregoing purification steps, in various combinations, can also be employed to
provide a substantially homogeneous isolated recombinant protein. The protein thus
purified is substantially free of other mammalian proteins and is defined in accordance
with the present invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids has been deleted,
inserted, or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide or analog is fused to another moiety or moieties, e.g., targeting moiety or
another therapeutic agent. Such analogs may exhibit improved properties such as activity
and/or stability. Examples of moieties which may be fused to the polypeptide or an
analog include, for example, targeting moieties which provide for the delivery of
polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, antibodies to immune
cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well as receptor
and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for
example, immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3
antibodies and steroids. Also, polypeptides may be fused to immune modulators, and
other cytokines such as alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY AND SIMILARITY

Preferred identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified
in computer programs including, but not limited to, the GCG program package,
including GAP (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group, University of Wisconsin, Madison, WI), BLASTP, BLASTN,
BLASTX, FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST
(Altschul S.F. et al., Nucleic Acids Res. vol. 25, pp. 3389-3402, herein incorporated by
reference), eMatrix software (Wu et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999),
herein incorporated by reference), eMotif software (Nevill-Manning et al., ISMB-97, Vol.
4, pp. 202-209, herein incorporated by reference), pFam software (Sonnhammer et al.,
Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by reference)
and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol. 157, pp. 105-31
(1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources
(BLAST Manual, Altschul, S., et al. NCB NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
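
As an illustration only, the notion of percent identity between two sequences that have already been aligned (for example, by one of the programs listed above) can be sketched as follows. The function name, the gap character and the choice of denominator (all aligned columns in which at least one sequence has a residue) are assumptions made for this sketch; the listed programs each apply their own scoring and reporting conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pairwise alignment supplied as two gapped strings.

    Hypothetical helper: the denominator counts every column in which at least
    one sequence has a residue, so a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column of two gaps carries no information
        compared += 1
        if x == y:
            matches += 1  # x == y here implies neither character is a gap
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Three of the four aligned columns match.
    print(percent_identity("ACGT", "ACGA"))
```

Other conventions (for example, dividing by the length of the shorter sequence) yield different values for the same alignment, so any reported identity should state the convention used.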

4.7 CHIMERIC AND FUSION PROTEINS

The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to another polypeptide. Within a fusion protein the polypeptide
according to the invention can correspond to all or a portion of a protein according to the
invention. In one embodiment, a fusion protein comprises at least one biologically active
portion of a protein according to the invention. In another embodiment, a fusion protein
comprises at least two biologically active portions of a protein according to the invention.
Within the fusion protein, the term "operatively linked" is intended to indicate that the
polypeptide according to the invention and the other polypeptide are fused in-frame to
each other. The polypeptide can be fused to the N-terminus or C-terminus, or to the
middle.






For example, in one embodiment a fusion protein comprises a polypeptide
according to the invention operably linked to the extracellular domain of a second
protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more
domains fused to sequences derived from a member of the immunoglobulin protein
family. The immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical compositions and administered to a subject to inhibit an interaction
between a ligand and a protein of the invention on the surface of a cell, to thereby
suppress signal transduction in vivo. The immunoglobulin fusion proteins can be used to
affect the bioavailability of a cognate ligand. Inhibition of the ligand/protein interaction
may be useful therapeutically for both the treatment of proliferative and differentiative
disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting) cell survival.
Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a
ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation,
restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends
as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic ligation. In another embodiment, the fusion gene can be synthesized by
conventional techniques including automated DNA synthesizers. Alternatively, PCR
amplification of gene fragments can be carried out using anchor primers that give rise to
complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John
Wiley & Sons, 1992). Moreover, many expression vectors are commercially available
that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a
polypeptide of the invention can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the protein of the invention.
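
By way of illustration only, the requirement that two coding sequences be joined "in-frame" can be checked computationally before a construct is assembled. The following is a minimal sketch under stated assumptions: the function name, the optional linker argument and the example sequences are hypothetical, and the check covers only reading-frame arithmetic and premature stop codons, not any of the cloning steps described above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds: str, downstream_cds: str, linker: str = "") -> str:
    """Join two coding sequences so the downstream one stays in the upstream frame.

    Hypothetical helper: each input must be a whole number of codons, the
    upstream stop codon (if present) is removed, and the result is rejected
    if any codon before the last one is a stop codon.
    """
    up, down, link = upstream_cds.upper(), downstream_cds.upper(), linker.upper()
    for name, seq in (("upstream", up), ("downstream", down), ("linker", link)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length {len(seq)} is not a multiple of 3")
    # Drop the upstream stop codon so translation reads through into the partner.
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    fused = up + link + down
    # Every codon except the last must be a sense codon, or the fusion is truncated.
    for i in range(0, len(fused) - 3, 3):
        if fused[i:i + 3] in STOP_CODONS:
            raise ValueError(f"premature stop codon at nucleotide {i}")
    return fused


if __name__ == "__main__":
    # Toy sequences, not sequences of the invention.
    print(fuse_in_frame("ATGAAATAA", "ATGCCCTGA", linker="GGA"))
```

A linker whose length is not a multiple of three shifts the downstream reading frame, which is why the sketch rejects it rather than attempting a repair.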

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention gene may result in loss of
normal function of the encoded protein. The invention thus provides gene therapy to
restore normal activity of the polypeptides of the invention; or to treat disease states
involving polypeptides of the invention. Delivery of a functional gene encoding
polypeptides of the invention to appropriate cells is effected ex vivo, in situ, or in vivo by
use of vectors, and more particularly viral vectors (e.g., adenovirus, adeno-associated
virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g.,
liposomes or chemical treatments). See, for example, Anderson, Nature, supplement to
vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of gene therapy technology
see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific American: 68-84
(1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of the
nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient
expression) or artificial chromosomes (stable expression). Cells may also be cultured ex
vivo in the presence of proteins of the present invention in order to proliferate or to
produce a desired effect on or activity in such cells. Treated cells can then be introduced
in vivo for therapeutic purposes. Alternatively, it is contemplated that in other human
disease states, preventing the expression of or inhibiting the activity of polypeptides of
the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of
antisense molecules to the nucleic acids of the present invention, their complements, or
their translated RNA sequences, by methods known in the art. Further, the polypeptides of the
present invention can be inhibited by using targeted deletion methods, or the insertion of a
negative regulatory element such as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to
express the polynucleotides of the invention, wherein such polynucleotides are in operative
association with a regulatory sequence heterologous to the host cell which drives expression
of the polynucleotides in the cell. These methods can be used to increase or decrease the
expression of the polynucleotides of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease, expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression
by replacing, in whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter so that the cells express the protein at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the desired protein
encoding sequences. See, for example, PCT International Publication No. WO 94/12650,
PCT International Publication No. WO 92/20808, and PCT International Publication No.
WO 91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes
carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron
DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods
results in co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene may be replaced by homologous recombination. As described herein,
gene targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional
initiation sites, regulatory protein binding sites or combinations of said sequences.
Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different
cell-type specificity than the naturally occurring elements. Here, the naturally occurring
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application
No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.


4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed
or inactivated in the germ line of animals using homologous recombination [Capecchi,
Science 244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the
regulatory control of exogenous or endogenous promoter elements, are known as
transgenic animals. Animals in which an endogenous gene has been inactivated by
homologous recombination are referred to as "knockout" animals. Knockout animals,
preferably non-human mammals, can be prepared as described in U.S. Patent No.
5,557,032, incorporated herein by reference. Transgenic animals are useful to determine
the roles polypeptides of the invention play in biological processes, and preferably in
disease states. Transgenic animals are useful as model systems to identify compounds
that modulate lipid metabolism. Transgenic animals, preferably non-human mammals,
are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased
protein expression. The homologous promoter can be supplemented by insertion of one
or more heterologous enhancer elements known to confer promoter activation in a
particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to
express polypeptides of the invention or that express a variant polypeptide. Such animals
are useful as models for studying the in vivo activities of polypeptide as well as for
studying modulators of the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit
one or more of the uses or biological activities (including those associated with assays
cited herein) identified herein. Uses or activities described for proteins of the present
invention may be provided by administration or use of such proteins or of
polynucleotides encoding such proteins (such as, for example, in gene therapies or
vectors suitable for introduction of DNA). The mechanism underlying the particular
condition or pathology will dictate whether the polypeptides of the invention, the
polynucleotides of the invention or modulators (activators or inhibitors) thereof would be
beneficial to the subject in need of treatment. Thus, "therapeutic compositions of the
invention" include compositions comprising isolated polynucleotides (including
recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and
truncations or domains thereof), or compounds and other substances that modulate the
overall activity of the target gene products, either at the level of target gene/protein
expression or target protein activity. Such modulators include polypeptides, analogs
(variants), including fragments and fusion proteins, antibodies and other binding proteins;
chemical compounds that directly or indirectly activate or inhibit the polypeptides of the
invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of
the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the
research community for various purposes. The polynucleotides can be used to express
recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either
constitutively or at a particular stage of tissue differentiation or development or in disease
states); as molecular weight markers on gels; as chromosome markers or tags (when
labeled) to identify chromosomes or to map related gene positions; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes
to hybridize and thus discover novel, related DNA sequences; as a source of information
to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and
making oligomers for attachment to a "gene chip" or other support, including for
examination of expression patterns; to raise anti-protein antibodies using DNA
immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another
immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand
interaction), the polynucleotide can also be used in interaction trap assays (such as, for
example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify
inhibitors of the binding interaction.

The polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for
high-throughput screening; to raise antibodies or to elicit another immune response; as a
reagent (including the labeled reagent) in assays designed to quantitatively determine
levels of the protein (or its receptor) in biological fluids; as markers for tissues in which
the corresponding polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in a disease state); and, of
course, to isolate correlative receptors or ligands. Proteins involved in these binding
interactions can also be used to screen for peptide or small molecule inhibitors or agonists
of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent
grade or kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in
the art. References disclosing such methods include without limitation "Molecular
Cloning: A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press,
Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology:
Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R.
Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source
of carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be
added to the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the
case of microorganisms, the polypeptide or polynucleotide of the invention can be added to
the medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine,
cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell
populations. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Many protein factors discovered to date, including all known cytokines, have
exhibited activity in one or more factor-dependent cell proliferation assays, and hence the
assays serve as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions of the present invention is evidenced by any one of a number of routine
factor dependent cell proliferation assays for cell lines including, without limitation, 32D,
DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123,
T1165, HT2, CTLL2, TF-1, Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions
of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular
Immunology 133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node
cells or thymocytes include, without limitation, those described in: Polyclonal T cell
stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology.
J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto.
1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic
cells include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938, 1983; Measurement of mouse and human interleukin 6-Nordan, R. In
Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley
and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986;
Measurement of human Interleukin 11-Bennett, F., Giannotti, J., Clark, S. C. and Turner,
K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John
Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others,
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter
6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans);
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al.,
Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai
et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor
activity and be involved in the proliferation, differentiation and survival of pluripotent
and totipotent stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells and/or germ line stem cells. Administration of the polypeptide
of the invention to stem cells in vivo or ex vivo is expected to maintain and expand cell
populations in a totipotential or pluripotential state which would be useful for
re-engineering damaged or diseased tissues, transplantation, manufacture of
bio-pharmaceuticals and the development of bio-sensors. The ability to produce large
quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors;
implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells,
gastrointestinal cells and others; and organs for transplantation such as kidney, liver,
pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or
cytokines may be administered in combination with the polypeptide of the invention to
achieve the desired effect, including any of the growth factors listed herein, other stem
cell maintenance factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF), Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble
IL-6 receptor fused to IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha),
G-CSF, GM-CSF, thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth
factor (PDGF), neural growth factors and basic fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type,
expansion of these cells in culture will facilitate the production of large quantities of
mature cells. Techniques for culturing stem cells are known in the art and administration
of polypeptides of the invention, optionally with other growth factors and/or cytokines, is
expected to enhance the survival and proliferation of the stem cell populations. This can
be accomplished by direct administration of the polypeptide of the invention to the
culture medium. Alternatively, stroma cells transfected with a polynucleotide that
encodes the polypeptide of the invention can be used as a feeder layer for the stem
cell populations in culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to
induce autocrine expression of the polypeptide of the invention. This will allow for
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as
is or that can then be differentiated into the desired mature cell types. These stable cell
lines can also serve as a source of undifferentiated totipotential/pluripotential mRNA to
create cDNA libraries and templates for polymerase chain reaction experiments. These
studies would allow for the isolation and identification of differentially expressed genes
in stem cell populations that regulate stem cell proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in
the treatment of many pathological conditions. For example, polypeptides of the present
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial
cells that can be used to augment or replace cells damaged by illness, autoimmune
disease, accidental damage or genetic disorders. The polypeptide of the invention may be
useful for inducing the proliferation of neural cells and for the regeneration of nerve and
brain tissue, i.e. for the treatment of central and peripheral nervous system diseases and
neuropathies, as well as mechanical and traumatic disorders which involve degeneration,
death or trauma to neural cells or nerve tissue. In addition, the expanded stem cell
populations can also be genetically altered for gene therapy purposes and to decrease host
rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also
be manipulated to achieve controlled differentiation of the stem cells into more
differentiated cell types. A broadly applicable method of obtaining pure populations of a
specific differentiated cell type from undifferentiated stem cell populations involves the
use of a cell-type specific promoter driving a selectable marker. The selectable marker
allows only cells of the desired type to survive. For example, stem cells can be induced
to differentiate into cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991);
Klug et al., J. Clin. Invest. 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L.
W. In: Principles of Tissue Engineering eds. Lanza et al., Academic Press (1997)).
Alternatively, directed differentiation of stem cells can be accomplished by culturing the
stem cells in the presence of a differentiation factor such as retinoic acid and an
antagonist of the polypeptide of the invention which would inhibit the effects of
endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one
of various cell sources (including hematopoietic stem cells and embryonic stem cells) and
cultured on a feeder layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in
combination with other growth factors or cytokines. The ability of the polypeptide of the
invention to induce stem cell proliferation is determined by colony formation on semi-solid
support, e.g., as described by Bernstein et al., Blood, 77: 2316-2321 (1991).
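
As an illustration only, the colony formation readout described above could be summarized numerically as in the following sketch. The function names, the per-100-cells normalization and the example counts are assumptions made for illustration; they are not taken from the cited references and are not part of the disclosed method.

```python
def colony_forming_efficiency(colonies_counted, cells_plated):
    """Return colonies formed per 100 cells plated (hypothetical readout)."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    if colonies_counted < 0:
        raise ValueError("colonies_counted must be non-negative")
    return 100.0 * colonies_counted / cells_plated


def fold_stimulation(treated_colonies, control_colonies):
    """Ratio of mean colony counts in treated wells to control wells."""
    if not treated_colonies or not control_colonies:
        raise ValueError("both lists of replicate counts must be non-empty")
    mean_treated = sum(treated_colonies) / len(treated_colonies)
    mean_control = sum(control_colonies) / len(control_colonies)
    if mean_control == 0:
        raise ValueError("control wells formed no colonies; ratio undefined")
    return mean_treated / mean_control


if __name__ == "__main__":
    # Illustrative replicate counts only; not experimental data.
    print(colony_forming_efficiency(25, 1000))
    print(fold_stimulation([30, 34, 32], [15, 17, 16]))
```

A fold stimulation above 1 in such a sketch would suggest that the tested polypeptide increased colony formation relative to control wells; any cut-off for calling a result positive would have to be set by the experimenter.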



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders.
Even marginal biological activity in support of colony forming cells or of
factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g. in
supporting the growth and proliferation of erythroid progenitor cells alone or in
combination with other cytokines, thereby indicating utility, for example, in treating
various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the
production of erythroid precursors and/or erythroid cells; in supporting the growth and
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to
prevent or treat consequent myelo-suppression; in supporting the growth and proliferation
of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in
place of or complementary to platelet transfusions; and/or in supporting the growth and
proliferation of hematopoietic stem cells which are capable of maturing to any and all of
the above-mentioned hematopoietic cells and therefore find therapeutic utility in various
stem cell disorders (such as those usually treated with transplantation, including, without
limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in
repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or
ex vivo (i.e., in conjunction with bone marrow transplantation or with peripheral
progenitor cell transplantation (homologous or heterologous)) as normal cells or
genetically manipulated for gene therapy.
Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines
are cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without
limitation, those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller
et al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among
others, proteins that regulate lympho-hematopoiesis) include, without limitation, those
described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New
York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992;
Primitive hematopoietic colony forming cells with high proliferative potential, McNiece,
I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol
pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental
Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E.
In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal
cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term
culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.






4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage,
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing
and tissue repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone
growth in circumstances where bone is not normally formed, has application in the
healing of bone fractures and cartilage damage or defects in humans and other animals.
Compositions of a polypeptide, antibody, binding partner, or other modulator of the
invention may have prophylactic use in closed as well as open fracture reduction and also
in the improved fixation of artificial joints. De novo bone formation induced by an
osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic
resection induced craniofacial defects, and also is useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors
of bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative
disorders, or periodontal disease, such as through stimulation of bone and/or cartilage
repair or by blocking inflammation or processes of tissue destruction (collagenase
activity, osteoclast activity, etc.) mediated by inflammatory processes may also be
possible using the composition of the invention.

Another category of tissue regeneration activity that may involve the polypeptide
of the present invention is tendon/ligament formation. Induction of tendon/ligament-like
tissue or other tissue formation in circumstances where such tissue is not normally
formed, has application in the healing of tendon or ligament tears, deformities and other
tendon or ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue.
De novo tendon/ligament-like tissue formation induced by a composition of the present
invention contributes to the repair of congenital, trauma induced, or other tendon or
ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention
may provide environment to attract tendon- or ligament-forming cells, stimulate growth
of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo
for return in vivo to effect tissue repair. The compositions of the invention may also be
useful in the treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament
defects. The compositions may also include an appropriate matrix and/or sequestering
agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of
neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve
tissue. More specifically, a composition may be used in the treatment of diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and
localized neuropathies, and central nervous system diseases, such as Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager
syndrome. Further conditions which may be treated in accordance with the present
invention include mechanical and traumatic disorders, such as spinal cord disorders, head
trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting
from chemotherapy or other medical therapies may also be treatable using a composition
of the invention.

Compositions of the invention may also be useful to promote better or faster
closure of non-healing wounds, including without limitation pressure ulcers, ulcers
associated with vascular insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver,
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular
(including vascular endothelium) tissue, or for promoting the growth of cells comprising
such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic
scarring, which may allow normal tissue to regenerate. A polypeptide of the present
invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues,
and conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described
in: International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent
Publication No. WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in:
Winter, Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J.
Invest. Dermatol. 71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for which assays
are described herein. A polynucleotide of the invention can encode a polypeptide
exhibiting such activities. A protein may be useful in the treatment of various immune
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g.,
in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as
effecting the cytolytic activity of NK cells and other cell populations. These immune
deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or
fungal infections, or may result from autoimmune disorders. More specifically, infectious
diseases caused by viral, bacterial, fungal or other infection may be treatable using a
protein of the present invention, including infections by HIV, hepatitis viruses, herpes
viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be
useful where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present
invention include, for example, connective tissue disease, multiple sclerosis, systemic
lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation,
Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus,
myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease.
Such a protein (or antagonists thereof, including antibodies) of the present invention may
also be useful in the treatment of allergic reactions and conditions (e.g., anaphylaxis,
serum sickness, drug reactions, food allergies, insect venom allergies, allergic rhinitis,
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic dermatitis, allergic
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic
conjunctivitis, atopic keratoconjunctivitis, venereal keratoconjunctivitis, giant
papillary conjunctivitis and contact allergies), such as asthma (particularly allergic
asthma) or other respiratory problems. Other conditions, in which immune suppression is
desired (including, for example, organ transplantation), may also be treatable using a
protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo
animal models such as the cumulative contact enhancement test [...], guinea pig skin
sensitization test (Vohr et al., Arch. Toxicol. 73: 501-[...]), and murine local lymph
node assay (Kimber et al., J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses, in a number of ways. Down regulation may be in the form of inhibiting or
blocking an immune response already in progress or may involve preventing the
induction of an immune response. The functions of activated T cells may be inhibited by
suppressing T cell responses or by inducing specific tolerance in T cells, or both.
Immunosuppression of T cell responses is generally an active, non-antigen-specific,
process which requires continuous exposure of the T cells to the suppressive agent.
Tolerance, which involves inducing non-responsiveness or anergy in T cells, is
distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the
absence of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing
high level lymphokine synthesis by activated T cells, will be useful in situations of tissue,
skin and organ transplantation and in graft-versus-host disease (GVHD). For example,
blockage of T cell function should result in reduced tissue destruction in tissue
transplantation. Typically, in tissue transplants, rejection of the transplant is initiated
through its recognition as foreign by T cells, followed by an immune reaction that
destroys the transplant. The administration of a therapeutic composition of the invention
may prevent cytokine synthesis by immune cells, such as T cells, and thus acts as an
immunosuppressant. Moreover, a lack of costimulation may also be sufficient to anergize
the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B
lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration of
these blocking reagents. To achieve sufficient immunosuppression or tolerance in a
subject, it may also be necessary to block the function of a combination of B lymphocyte
antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl.
Acad. Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul
ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used
to determine the effect of therapeutic compositions of the invention on the development
of that disease.

Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate
activation of T cells that are reactive against self tissue and which promote the production
of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the
activation of autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents which block stimulation of T cells can be used to inhibit T
cell activation and prevent production of autoantibodies or T cell-derived cytokines
which may be involved in the disease process. Additionally, blocking reagents may
induce antigen-specific tolerance of autoreactive T cells which could lead to long-term
relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal
models of human autoimmune diseases. Examples include murine experimental
autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB
hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and
BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental
Immunology, Raven Press, New York, 1989, pp. 840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a
means of upregulating immune responses, may also be useful in therapy. Upregulation of
immune responses may be in the form of enhancing an existing immune response or
eliciting an initial immune response. For example, enhancing an immune response may
be useful in cases of viral infection, including systemic viral diseases such as influenza,
the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient
by removing T cells from the patient, costimulating the T cells in vitro with viral
antigen-pulsed APCs either expressing a peptide of the present invention or together with
a stimulatory form of a soluble peptide of the present invention and reintroducing the in
vitro activated T cells into the patient. Another method of enhancing anti-viral immune
responses would be to isolate infected cells from a patient, transfect them with a nucleic
acid encoding a protein of the present invention as described herein such that the cells
express all or a portion of the protein on their surface, and reintroduce the transfected
cells into the patient. The infected cells would now be capable of delivering a
costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation
signal to T cells to induce a T cell mediated immune response against the transfected
tumor cells. In addition, tumor cells which lack MHC class I or MHC class II molecules,
or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules,
can be transfected with nucleic acid encoding all or a portion of (e.g., a
cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and β2
microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta
chain protein to thereby express MHC class I or MHC class II proteins on the cell
surface. Expression of the appropriate class I or class II MHC in conjunction with a
peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a
T cell mediated immune response against the transfected tumor cell. Optionally, a gene
encoding an antisense construct which blocks expression of an MHC class II associated
protein, such as the invariant chain, can also be cotransfected with a DNA encoding a
peptide having the activity of a B lymphocyte antigen to promote presentation of tumor
associated antigens and induce tumor specific immunity. Thus, the induction of a T cell
mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured 
by the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without
limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc.
Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974,
1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J.
Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991;
Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which will identify, among others, proteins that modulate T-cell dependent antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those described
in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In
vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons,
Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins that generate predominantly Th1 and CTL responses) include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M.
Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J.
Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by dendritic cells that activate naive T-cells) include, without limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et
al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that regulate
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz
et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and
development include, without limitation, those described in: Antica et al., Blood
84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al.,
Blood 85:2770-2778, 1995; Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to
stimulate the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the
present invention, alone or in heterodimers with a member of the inhibin family, may be
useful as a contraceptive based on the ability of inhibins to decrease fertility in female
mammals and decrease spermatogenesis in male mammals. Administration of sufficient
amounts of other inhibins can induce infertility in these mammals. Alternatively, the
polypeptide of the invention, as a homodimer or as a heterodimer with other protein
subunits of the inhibin group, may be useful as a fertility inducing therapeutic, based
upon the ability of activin molecules to stimulate FSH release from cells of the anterior
pituitary. See, for example, U.S. Pat. No. 4,798,885. A polypeptide of the invention may
also be useful for advancement of the onset of fertility in sexually immature mammals, so
as to increase the lifetime reproductive performance of domestic animals such as, but not
limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be 
measured by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale
et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al.,
Proc. Natl. Acad. Sci. USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Chemotactic and chemokinetic receptor activation can be used to mobilize or
attract a desired cell population to a desired site of action. Chemotactic or chemokinetic
compositions (e.g. proteins, antibodies, binding partners, or modulators of the invention)
provide particular advantages in treatment of wounds and other trauma to tissues, as well
as in treatment of localized infections. For example, attraction of lymphocytes,
monocytes or neutrophils to tumors or sites of infection may result in improved immune
responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it
can stimulate, directly or indirectly, the directed orientation or movement of such cell
population. Preferably, the protein or peptide has the ability to directly stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a population
of cells can be readily determined by employing such protein or peptide in any known
assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or
prevent chemotaxis) consist of assays that measure the ability of a protein to induce the
migration of cells across a membrane as well as the ability of a protein to induce the
adhesion of one cell population to another cell population. Suitable assays for movement
and adhesion include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W.
Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12,
Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest.
95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 1995; Muller et al. Eur. J.
Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 1994; Johnston et
al. J. of Immunol. 153:1762-1768, 1994.
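
By way of a hedged illustration, counts from a membrane migration assay of the kind referred to above might be reduced to a simple index as sketched below. The three-condition design (no factor, factor in the lower chamber only, factor in both chambers), the function names and the two-fold threshold are assumptions introduced here for illustration; they are not prescribed by the cited protocols.

```python
def migration_index(test_count, baseline_count):
    """Fold change in migrated cells relative to the no-factor baseline."""
    if baseline_count <= 0:
        raise ValueError("baseline_count must be positive")
    return test_count / baseline_count


def classify_response(baseline, gradient, uniform, threshold=2.0):
    """Classify a migration response from three hypothetical conditions.

    baseline: cells migrated with no factor in either chamber
    gradient: cells migrated with factor in the lower chamber only
    uniform:  cells migrated with equal factor in both chambers
    threshold: assumed fold increase treated as a positive response
    """
    gradient_index = migration_index(gradient, baseline)
    uniform_index = migration_index(uniform, baseline)
    if uniform_index >= threshold:
        # Migration rises even without a gradient: random motility.
        return "chemokinetic"
    if gradient_index >= threshold:
        # Migration rises only when a gradient is present: directed movement.
        return "chemotactic"
    return "none"


if __name__ == "__main__":
    # Illustrative counts only; not experimental data.
    print(classify_response(baseline=50, gradient=200, uniform=60))
```

The sketch separates directed movement (a response that requires a concentration gradient) from increased random movement (a response seen even at uniform concentration), which is the distinction drawn in this section between chemotactic and chemokinetic activity.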

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or
thrombolysis or thrombosis. A polynucleotide of the invention can encode a polypeptide
exhibiting such attributes. Compositions may be useful in treatment of various
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance
coagulation and other hemostatic events in treating wounds resulting from trauma,
surgery or other causes. A composition of the invention may also be useful for dissolving
or inhibiting formation of thromboses and for treatment and prevention of conditions
resulting therefrom (such as, for example, infarction of cardiac and central nervous
system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al.,
Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991);
Schaub, Prostaglandins 35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation,
proliferation or metastasis. Detection of the presence or amount of polynucleotides or
polypeptides of the invention may be useful for the diagnosis and/or prognosis of one or
more types of cancer. For example, the presence or increased expression of a
polynucleotide/polypeptide of the invention may indicate a hereditary risk of cancer, a
precancerous condition, or an ongoing malignancy. Conversely, a defect in the gene or
absence of the polypeptide may be associated with a cancer condition. Identification of
single nucleotide polymorphisms associated with cancer or a predisposition to cancer
may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell
proliferation, inhibiting angiogenesis (growth of new blood vessels that is necessary to
support tumor growth) and/or prohibiting metastasis by reducing tumor cell motility or
invasiveness. Therapeutic compositions of the invention may be effective in adult and
pediatric oncology including in solid phase tumors/malignancies, locally advanced
tumors, human soft tissue sarcomas, metastatic cancer, including lymphatic metastases,
blood cell malignancies including multiple myeloma, acute and chronic leukemias, and
lymphomas, head and neck cancers including mouth cancer, larynx cancer and thyroid
cancer, lung cancers including small cell carcinoma and non-small cell cancers, breast
cancers including small cell carcinoma and ductal carcinoma, gastrointestinal cancers
including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers
including bladder cancer and prostate cancer, malignancies of the female genital tract
including ovarian carcinoma, uterine (including endometrial) cancers, and solid tumor in
the ovarian follicle, kidney cancers including renal cell carcinoma, brain cancers
including intrinsic brain tumors, neuroblastoma, astrocytic brain tumors, gliomas,
metastatic tumor cell invasion in the central nervous system, bone cancers including
osteomas, skin cancers including malignant melanoma, tumor progression of human skin
[...] Karposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can be
administered in therapeutically effective dosages alone or in combination [...]
without necessarily eradicating the cancer.

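The composition can also be administered in therapeutically effective amounts as
a portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the
polypeptide or modulator of the invention with one or more anti-cancer drugs in addition
to a pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails
as a cancer treatment is routine. Anti-cancer drugs that are well known in the art and can be
used as a treatment in combination with the polypeptide or modulator of the invention
include: Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan,
Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide,
Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl,
Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), Floxuridine,
5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon
Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, [...]
Procarbazine HCl, Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine
sulfate, Vincristine sulfate, Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2,
Mitoguazone, Pentostatin, Semustine, Teniposide, and Vindesine sulfate.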

In addition, therapeutic compositions of the invention may be used for
prophylactic treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g. exposure to carcinogens) known in the art that predispose an individual to
developing cancers. Under these circumstances, it may be beneficial to treat these
individuals with therapeutically effective doses of the polypeptide of the invention to
reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of
the invention as a potential cancer treatment. These in vitro models include proliferation
assays of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney,
(1987) Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York,
NY Ch 18 and Ch 21), tumor systems in nude mice as described in Giovanella et al., J.
Natl. Can. Inst., 52: 921-30 (1974), mobility and invasive potential of tumor cells in
Boyden Chamber assays as described in Pilkington et al., Anticancer Res., 17: 4107-9
(1997), and angiogenesis assays such as induction of vascularization of the chick
chorioallantoic membrane or induction of vascular endothelial cell migration as described
in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp.
Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, e.g.
from American Type Tissue Culture Collection catalogs.
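As an illustration of how readouts from a proliferation assay might be reduced to an effective-dose estimate, the following minimal Python sketch interpolates the concentration that gives 50% of control proliferation. The function name, the 50% criterion, and the log-linear interpolation are assumptions made for illustration; the cited assays do not prescribe them, and the data are hypothetical.

```python
import math

def interpolate_ic50(concentrations, viabilities):
    """Estimate the concentration giving 50% of control proliferation.

    concentrations: ascending, positive test concentrations (one common unit).
    viabilities: proliferation as a fraction of untreated control (1.0 = no effect).
    Returns None when the series never crosses 50%.
    """
    points = list(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            # Linear interpolation on a log10 concentration axis.
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical readout from cultured tumor cells (arbitrary concentration units).
doses = [1.0, 10.0, 100.0]
viable = [0.9, 0.6, 0.4]
ic50 = interpolate_ic50(doses, viable)
```

The estimate is only as good as the spacing of the tested concentrations; a fitted dose-response curve would normally be preferred when enough data points are available.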

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide
of the invention can encode a polypeptide exhibiting such characteristics. Examples of
such receptors and ligands include, without limitation, cytokine receptors and their
ligands, receptor kinases and their ligands, receptor phosphatases and their ligands,
receptors involved in cell-cell interactions and their ligands (including without limitation,
cellular adhesion molecules (such as selectins, integrins and their ligands)) and
receptor/ligand pairs involved in antigen presentation, antigen recognition and
development of cellular and humoral immune responses. Receptors and ligands are also
useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without
limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of
receptor/ligand interactions.

The activity of a polypeptide of the invention may, among other means, be
measured by the following methods:

Suitable assays for receptor-ligand activity include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static
conditions 7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987;
Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med.
169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al.,
Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor
for a ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may
be identified through binding assays, affinity chromatography, dihybrid screening assays,
BIAcore assays, gel overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists
or partial antagonists require the use of other proteins as competing ligands. The
polypeptides of the present invention or ligand(s) thereof may be labeled by being
coupled to radioisotopes, colorimetric molecules or toxin molecules by conventional
methods ("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in
Enzymology Vol. 182 (1990) Academic Press, Inc. San Diego). Examples of
radioisotopes include, but are not limited to, tritium and carbon-14. Examples of
colorimetric molecules include, but are not limited to, fluorescent molecules such as
fluorescamine, or rhodamine or other colorimetric molecules. Examples of toxins
include, but are not limited to, ricin.

4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using
the novel polypeptides or binding fragments thereof in any of a variety of drug screening
techniques. The polypeptides or fragments employed in such a test may either be free in
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably
transformed with recombinant nucleic acids expressing the polypeptide or a fragment
thereof. Drugs are screened against such transformed cells in competitive binding assays.
Such cells, either in viable or fixed form, can be used for standard binding assays. One
may measure, for example, the formation of complexes between polypeptides of the
invention or fragments and the agent being tested or examine the diminution in complex
formation between the novel polypeptides and an appropriate cell line, which are well
known in the art.
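The "diminution in complex formation" measured in a competitive binding assay is commonly reported as a percent inhibition relative to a control without test compound. The sketch below shows that arithmetic under stated assumptions: the function name, the background-subtraction step, and the signal values are hypothetical and are not taken from the specification.

```python
def percent_inhibition(signal_with_compound, signal_control, background=0.0):
    """Percent decrease in complex formation relative to a no-compound control.

    All three signals must be in the same units (e.g. counts per minute).
    """
    specific_control = signal_control - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = signal_with_compound - background
    return 100.0 * (1.0 - specific_test / specific_control)

# Hypothetical readings: control 1050, with test compound 550, background 50.
inhibition = percent_inhibition(550.0, 1050.0, background=50.0)
```

A compound that does not compete for binding gives a value near zero; a value near 100 indicates that complex formation was almost fully displaced.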

Sources for test compounds that may be screened for ability to bind to or 
modulate (i.e., increase or decrease) the activity of polypeptides of the invention include 
(1) inorganic and organic chemical libraries, (2) natural product libraries, and (3) 
combinatorial libraries comprised of either random or mimetic peptides, oligonucleotides
or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or
compounds that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria
and fungi), animals, plants or other vegetation, or marine organisms, and libraries of
mixtures for screening may be created by: (1) fermentation and extraction of broths from
soil, plant or marine microorganisms or (2) extraction of the organisms themselves.
Natural product libraries include polyketides, non-ribosomal peptides, and (non-naturally
occurring) variants thereof. For a review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides or organic compounds and can be readily prepared by traditional
automated synthesis methods, PCR, cloning or proprietary synthetic methods. Of
particular interest are peptide and oligonucleotide combinatorial libraries. Still other
libraries of interest include peptide, protein, peptidomimetic, multiparallel synthetic
collection, recombinatorial, and polypeptide libraries. For a review of combinatorial
chemistry and libraries created therefrom, see Myers, Curr. Opin. Biotechnol. 8:701-707
(1997). For reviews and examples of peptidomimetic libraries, see Al-Obeidi et al., Mol.
Biotechnol 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol 1(1):114-19 (1997);
Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the
"hit" to bind a polypeptide of the invention. The molecules identified in the binding assay
are then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin 
or cholera, or with other compounds that are toxic to cells such as radioisotopes. The 
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity 
of the binding molecule for a polypeptide of the invention. Alternatively, the binding 
molecules may be complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide,
e.g. a ligand or a receptor. The art provides numerous assays particularly useful for
identifying previously unknown binding partners for receptor polypeptides of the
invention. For example, expression cloning using mammalian or bacterial cells, or
dihybrid screening assays can be used to identify polynucleotides encoding binding
partners. As another example, affinity chromatography with the appropriate immobilized
polypeptide of the invention can be used to isolate polypeptides that recognize and bind
polypeptides of the invention. There are a number of different libraries used for the
identification of compounds, and in particular small molecules, that modulate (i.e.,
increase or decrease) biological activity of a polypeptide of the invention. Ligands for
receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical
except for the expression of the receptor of the invention: one cell population expresses
the receptor of the invention whereas the other does not. The responses of the two cell
populations to the addition of ligand(s) are then compared. Alternatively, an expression
library can be co-expressed with the polypeptide of the invention in cells and assayed for
an autocrine response to identify potential ligand(s). As still another example, BIAcore
assays, gel overlay assays, or other methods known in the art can be used to identify
binding partner polypeptides, including, (1) organic and inorganic chemical libraries, (2)
natural product libraries, and (3) combinatorial libraries comprised of random peptides,
oligonucleotides or organic molecules.
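The two-population comparison described above can be sketched as a simple difference screen: a candidate ligand is of interest when the receptor-expressing cells respond more strongly than the otherwise identical non-expressing cells. The function name, the threshold, and the response values below are hypothetical and serve only to illustrate the comparison.

```python
def receptor_dependent_hits(expressing, non_expressing, min_difference):
    """Return ligands whose response depends on receptor expression.

    expressing / non_expressing: dicts mapping ligand name to a measured
    response in receptor-expressing and non-expressing cells respectively.
    """
    hits = {}
    for ligand, response in expressing.items():
        difference = response - non_expressing[ligand]
        if difference >= min_difference:
            hits[ligand] = difference
    return hits

# Hypothetical responses (arbitrary units) for three candidate ligands.
with_receptor = {"ligand_A": 8.0, "ligand_B": 2.5, "ligand_C": 3.0}
without_receptor = {"ligand_A": 2.0, "ligand_B": 2.0, "ligand_C": 3.0}
hits = receptor_dependent_hits(with_receptor, without_receptor, min_difference=2.0)
```

When a cocktail of ligands scores as a hit, its components would then be tested individually in the same way.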

The role of downstream intracellular signaling molecules in the signaling cascade
of the polypeptide of the invention can be determined. For example, a chimeric protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a protein, whose ligand has been identified, is produced in a host
cell. The cell is then incubated with the ligand specific for the extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins
involved in intracellular signaling can then be assayed for expected modifications, i.e.
phosphorylation. Other methods known to those in the art can also be used to identify
signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory
activity. The anti-inflammatory activity may be achieved by providing a stimulus to cells
involved in the inflammatory response, by inhibiting or promoting cell-cell interactions
(such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells
involved in the inflammatory process, inhibiting or promoting cell extravasation, or by
stimulating or suppressing production of other factors which more directly inhibit or
promote an inflammatory response. Compositions with such activities can be used to treat
inflammatory conditions (including chronic or acute conditions), including without
limitation inflammation associated with infection (such as septic shock, sepsis or systemic
inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin
lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting
from over production of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or
material. Compositions of this invention may be utilized to prevent or treat conditions
such as, but not limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced
shock, rheumatoid arthritis, chronic inflammatory arthritis, pancreatic cell damage from
diabetes mellitus type 1, graft versus host disease, inflammatory bowel disease,
inflammation associated with pulmonary disease, other autoimmune disease or
inflammatory disease, or as an antiproliferative agent such as for acute or chronic
myelogenous leukemia or in the prevention of premature labor secondary to intrauterine
infections.

4.10.16 LEUKEMIAS

Leukemias and related disorders may be treated or prevented by administration of
a therapeutic that promotes or inhibits function of the polynucleotides and/or
polypeptides of the invention. Such leukemias and related disorders include but are not
limited to acute leukemia, acute lymphocytic leukemia, acute myelocytic leukemia,
myeloblastic, promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic
leukemia, chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia
(for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B.
Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication
of therapeutic utility, include but are not limited to nervous system injuries, and diseases
or disorders which result in either a disconnection of axons, a diminution or degeneration
of neurons, or demyelination. Nervous system lesions which may be treated in a patient
(including human and non-human mammalian patients) according to the invention
include but are not limited to the following lesions of either the central (including spinal
cord, brain) or peripheral nervous systems:
(i) traumatic lesions, including lesions caused by physical injury or associated
with surgery, for example, lesions which sever a portion of the nervous system, or
compression injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous
system results in neuronal injury or death, including cerebral infarction or ischemia, or
spinal cord infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme
disease, tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed
or injured as a result of a degenerative process including but not limited to degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or
amyotrophic lateral sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion
of the nervous system is destroyed or injured by a nutritional disorder or disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus,
carcinoma, or sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or injured by a demyelinating disease including but not limited to multiple
sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy
of various etiologies, progressive multifocal leukoencephalopathy, and central pontine
myelinolysis.

Therapeutics which are useful according to the invention for treatment of a
nervous system disorder may be selected by testing for biological activity in promoting
the survival or differentiation of neurons. For example, and not by way of limitation,
therapeutics which elicit any of the following effects may be useful according to the
invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting
of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol.
70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor
neuron dysfunction may be measured by assessing the physical manifestation of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional
disability.

In specific embodiments, motor neuron disorders that may be treated according to
the invention include but are not limited to disorders such as infarction, infection,
exposure to toxin, trauma, surgical damage, degenerative disease or malignancy that may
affect motor neurons as well as other components of the nervous system, as well as
disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and
including but not limited to progressive spinal muscular atrophy, progressive bulbar
palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive
bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio
syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and other
parasites; affecting (suppressing or enhancing) bodily characteristics, including, without
limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or organ or body part size or shape (such as, for example, breast
augmentation or diminution, change in bone form or shape); affecting biorhythms or
circadian cycles or rhythms; affecting the fertility of male or female subjects; affecting
the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other nutritional
factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or
other pain reducing effects; promoting differentiation and growth of embryonic stem cells
in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case
of enzymes, correcting deficiencies of the enzyme and treating deficiency-related
diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis);
immunoglobulin-like activity (such as, for example, the ability to bind antigens or
complement); and the ability to act as an antigen in a vaccine composition to raise an
immune response against such protein or another material or entity which is
cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential
predisposition or susceptibility to various disease states (such as disorders involving
inflammation or immune response) or a differential response to drug administration, and
this genetic information can be used to tailor preventive or therapeutic treatment
appropriately. For example, the existence of a polymorphism associated with a
predisposition to inflammation or autoimmune disease makes possible the diagnosis of
this condition in humans by identifying the presence of the polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample,
optionally involving isolation or amplification of the DNA, and identifying the presence
of the polymorphism in the DNA. For example, PCR may be used to amplify an
appropriate fragment of genomic DNA which may then be sequenced. Alternatively, the
DNA may be subjected to allele-specific oligonucleotide hybridization (in which
appropriate oligonucleotides are hybridized to the DNA under conditions permitting
detection of a single base mismatch) or to a single nucleotide extension assay (in which
an oligonucleotide that hybridizes immediately adjacent to the position of the
polymorphism is extended with one or more labeled nucleotides). In addition, traditional
restriction fragment length polymorphism analysis (using restriction enzymes that
provide differential digestion of the genomic DNA depending on the presence or absence
of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences
of the present invention. In the alternative, any one of the nucleotide sequences of the
present invention can be placed on the array to detect changes from those sequences.
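The restriction fragment length polymorphism approach mentioned above can be illustrated in silico: a single base change that destroys a recognition site changes the set of fragment lengths produced by digestion. The sketch below uses EcoRI (recognition site GAATTC, cut after the first G) on two short sequences invented for the example; the function name and the sequences are hypothetical and do not correspond to any sequence of the invention.

```python
def digest_lengths(sequence, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is the cut position within the recognition site
    (1 for EcoRI, which cuts G^AATTC on the strand shown).
    """
    lengths = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        lengths.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    lengths.append(len(sequence) - start)
    return lengths

# Hypothetical alleles differing by one base inside the EcoRI site.
reference = "ATGCC" + "GAATTC" + "GGTACCTTAG"
variant = "ATGCC" + "GACTTC" + "GGTACCTTAG"

reference_fragments = digest_lengths(reference, "GAATTC", 1)
variant_fragments = digest_lengths(variant, "GAATTC", 1)
polymorphism_detected = reference_fragments != variant_fragments
```

In practice the fragment lengths would be read from a gel rather than computed, but the comparison of patterns between alleles is the same.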

Alternatively a polymorphism resulting in a change in the amino acid sequence 
could also be detected by detecting a corresponding change in amino acid sequence of the 
protein, e.g., by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the protocol is
described by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963,
Int. Arch. Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a
single injection, generally intradermally, of a suspension of killed Mycobacterium
tuberculosis in complete Freund's adjuvant (CFA). The route of injection can vary, but
rats may be injected at the base of the tail with an adjuvant mixture. The polypeptide is
administered in phosphate buffered solution (PBS) at a dose of about 1-5 mg/kg. The
control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by
immediately administering the test compound and subsequent treatment every other day
until day 24. At 14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium CFA, an
overall arthritis score may be obtained as described by J. Holoshitz above. An analysis of
the data would reveal that the test compound would have a dramatic effect on the
swelling of the joints as measured by a decrease of the arthritis score.
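The dosing and scoring schedule just described can be laid out as follows. This is a minimal sketch: the treatment days (day 0, then every other day to day 24) and the scoring days are taken from the text, while the function names, the use of a mean score as the summary statistic, and every score value are hypothetical and invented only to exercise the code.

```python
# Scoring days after adjuvant injection, as listed in the protocol text.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]

def treatment_days(last_day=24, interval=2):
    """Day 0 (immediately after injection), then every other day to last_day."""
    return list(range(0, last_day + 1, interval))

def mean_score(scores_by_day):
    """Average arthritis score over the scheduled scoring days."""
    return sum(scores_by_day[day] for day in SCORING_DAYS) / len(SCORING_DAYS)

# Hypothetical arthritis scores for one PBS control rat and one treated rat.
pbs_control = {14: 4, 15: 5, 18: 7, 20: 8, 22: 8, 24: 10}
treated = {14: 2, 15: 2, 18: 3, 20: 4, 22: 3, 24: 4}

schedule = treatment_days()
reduction = mean_score(pbs_control) - mean_score(treated)
```

A positive `reduction` corresponds to the decrease in arthritis score that the protocol treats as evidence of an effect; group sizes and statistics are outside the scope of this sketch.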



4.11 THERAPEUTIC METHODS

The compositions (including polypeptide fragments, analogs, variants and
antibodies or other binding partners or modulators including antisense polynucleotides)
of the invention have numerous applications in a variety of therapeutic methods.
Examples of therapeutic applications include, but are not limited to, those exemplified
herein.

4.11.1 EXAMPLE

One embodiment of the invention is the administration of an effective amount of
the polypeptides or other composition of the invention to individuals affected by a
disease or disorder that can be modulated by regulating the peptides of the invention.
While the mode of administration is not particularly important, parenteral administration
is preferred. An exemplary mode of administration is to deliver an intravenous bolus.
The dosage of the polypeptides or other composition of the invention will normally be
determined by the prescribing physician. It is to be expected that the dosage will vary
according to the age, weight, condition and response of the individual patient. Typically,
the amount of polypeptide administered per dose will be in the range of about 0.01 µg/kg
to 100 mg/kg of body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg
of patient body weight. For parenteral administration, polypeptides of the invention will
be formulated in an injectable form combined with a pharmaceutically acceptable
parenteral vehicle. Such vehicles are well known in the art and examples include water,
saline, Ringer's solution, dextrose solution, and solutions consisting of small amounts of
the human serum albumin. The vehicle may contain minor amounts of additives that
maintain the isotonicity and stability of the polypeptide or other active ingredient. The
preparation of such solutions is within the skill of the art.
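Because the per-dose ranges are stated per kilogram of body weight, the absolute amount per dose follows by simple multiplication. The sketch below takes the ranges as about 0.01 µg/kg to 100 mg/kg, with a preferred range of about 0.1 µg/kg to 10 mg/kg; the function name and the 70 kg body weight are hypothetical, and the calculation is illustrative only, since the actual dosage is left to the prescribing physician.

```python
MG_PER_UG = 1e-3  # unit conversion: 1 microgram = 0.001 milligram

# Ranges expressed in mg per kg of body weight.
BROAD_RANGE = (0.01 * MG_PER_UG, 100.0)     # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE = (0.1 * MG_PER_UG, 10.0)   # about 0.1 ug/kg to 10 mg/kg

def dose_range_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return (lowest, highest) absolute dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg

# Hypothetical 70 kg patient, preferred range.
low_mg, high_mg = dose_range_mg(70.0, *PREFERRED_RANGE)
```

For the hypothetical 70 kg patient, the preferred range spans roughly 0.007 mg to 700 mg per dose, which shows how wide the stated window is.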

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION

A protein or other composition of the present invention (from whatever source
derived, including without limitation from recombinant and non-recombinant sources and
including antibodies and other binding partners of the polypeptides of the invention) may
be administered to a patient in need, by itself, or in pharmaceutical compositions where it
is mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of
disorders. Such a composition may optionally contain (in addition to protein or other
active ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and
other materials well known in the art. The term "pharmaceutically acceptable" means a
non-toxic material that does not interfere with the effectiveness of the biological activity
of the active ingredient(s). The characteristics of the carrier will depend on the route of
administration. The pharmaceutical composition of the invention may also contain
cytokines, lymphokines, or other hematopoietic factors such as M-CSF, GM-CSF, TNF,
IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14,
IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell factor,
and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These
agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β),
insulin-like growth factor (IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either
enhance the activity of the protein or other active ingredient or complement its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or other active
ingredient of the invention, or to minimize side effects. Conversely, protein or other
active ingredient of the present invention may be included in formulations of the
particular clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic
or anti-thrombotic factor, or anti-inflammatory agent to minimize side effects of the
clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or
anti-thrombotic factor, or anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2,
anti-TNF, corticosteroids, immunosuppressive agents). A protein of the present
invention may be active in multimers (e.g., heterodimers or homodimers) or complexes
with itself or other proteins. As a result, pharmaceutical compositions of the invention
may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the
invention including a first protein, a second protein or a therapeutic agent may be
concurrently administered with the first protein (e.g., at the same time, or at differing
times provided that therapeutic concentrations of the combination of agents are achieved at
the treatment site). Techniques for formulation and administration of the compounds of
the instant application may be found in "Remington's Pharmaceutical Sciences," Mack
Publishing Co., Easton, PA, latest edition. A therapeutically effective dose further refers
to that amount of the compound sufficient to result in amelioration of symptoms, e.g., 
treatment, healing, prevention or amelioration of the relevant medical condition, or an 
increase in rate of treatment, healing, prevention or amelioration of such conditions. 
When applied to an individual active ingredient, administered alone, a therapeutically 
effective dose refers to that ingredient alone. When applied to a combination, a 
therapeutically effective dose refers to combined amounts of the active ingredients that 
result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a 
therapeutically effective amount of protein or other active ingredient of the present 
invention is administered to a mammal having a condition to be treated. Protein or other 
active ingredient of the present invention may be administered in accordance with the 
method of the invention either alone or in combination with other therapies such as 
treatments employing cytokines, lymphokines or other hematopoietic factors. When
co-administered with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other active ingredient of the present invention may be administered either
simultaneously with the cytokine(s), lymphokine(s), other hematopoietic factor(s),
thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician will decide on the appropriate sequence of administering protein or
other active ingredient of the present invention in combination with cytokine(s),
lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, 
transmucosal, or intestinal administration; parenteral delivery, including intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of
protein or other active ingredient of the present invention used in the pharmaceutical
composition or to practice the method of the present invention can be carried out in a
variety of conventional ways, such as oral ingestion, inhalation, topical application or
cutaneous, subcutaneous, intraperitoneal, parenteral or intravenous injection. Intravenous
administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic joints or
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the
compounds may be administered topically, for example, as eye drops. Furthermore, one
may administer the drug in a targeted drug delivery system, for example, in a liposome
coated with a specific antibody, targeting, for example, arthritic or fibrotic tissue. The
liposomes will be targeted to and taken up selectively by the afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an
effective dosage to the desired site of action. The determination of a suitable route of
administration and an effective dosage for a particular indication is within the level of
skill in the art. Preferably for wound treatment, one administers the therapeutic
compound directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be extrapolated from these dosages or from similar studies in appropriate
animal models. Dosages can then be adjusted as necessary by the clinician to provide
maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS

Pharmaceutical compositions for use in accordance with the present invention
thus may be formulated in a conventional manner using one or more physiologically
acceptable carriers comprising excipients and auxiliaries which facilitate processing of
the active compounds into preparations which can be used pharmaceutically. These
pharmaceutical compositions may be manufactured in a manner that is itself known, e.g.,
by means of conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying, encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon the route of administration chosen. When a therapeutically effective
amount of protein or other active ingredient of the present invention is administered
orally, protein or other active ingredient of the present invention will be in the form of a
tablet, capsule, powder, solution or elixir. When administered in tablet form, the
pharmaceutical composition of the invention may additionally contain a solid carrier such
as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95%
protein or other active ingredient of the present invention, and preferably from about 25
to 90% protein or other active ingredient of the present invention. When administered in
liquid form, a liquid carrier such as water, petroleum, oils of animal or plant origin such
as peanut oil, mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The
liquid form of the pharmaceutical composition may further contain physiological saline
solution, dextrose or other saccharide solution, or glycols such as ethylene glycol,
propylene glycol or polyethylene glycol. When administered in liquid form, the
pharmaceutical composition contains from about 0.5 to 90% by weight of protein or other
active ingredient of the present invention, and preferably from about 1 to 50% protein or
other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of
the present invention is administered by intravenous, cutaneous or subcutaneous
injection, protein or other active ingredient of the present invention will be in the form of
a pyrogen-free, parenterally acceptable aqueous solution. The preparation of such
parenterally acceptable protein or other active ingredient solutions, having due regard to
pH, isotonicity, stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition for intravenous, cutaneous, or subcutaneous injection should
contain, in addition to protein or other active ingredient of the present invention, an
isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, Dextrose
Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or other
vehicle as known in the art. The pharmaceutical composition of the present invention
may also contain stabilizers, preservatives, buffers, antioxidants, or other additives
known to those of skill in the art. For injection, the agents of the invention may be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks's solution, Ringer's solution, or physiological saline buffer. For transmucosal
administration, penetrants appropriate to the barrier to be permeated are used in the
formulation. Such penetrants are generally known in the art.

For oral administration, the compounds can be formulated readily by combining
the active compounds with pharmaceutically acceptable carriers well known in the art.
Such carriers enable the compounds of the invention to be formulated as tablets, pills,
dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be
obtained by combining the active compounds with a solid excipient, optionally grinding
the resulting mixture, and processing the mixture of granules, after adding suitable
auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol;
cellulose preparations such as, for example, maize starch, wheat starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose,
sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar,
or alginic acid or a salt thereof such as sodium alginate. Dragee cores are provided with
suitable coatings. For this purpose, concentrated sugar solutions may be used, which may
optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene
glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent
mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for
identification or to characterize different combinations of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules
made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as
glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture
with filler such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene glycols. In addition, stabilizers may be added. All formulations for
oral administration should be in dosages suitable for such administration. For buccal
administration, the compositions may take the form of tablets or lozenges formulated in
conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon
dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be
determined by providing a valve to deliver a metered amount. Capsules and cartridges
of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder
mix of the compound and a suitable powder base such as lactose or starch. The
compounds may be formulated for parenteral administration by injection, e.g., by bolus
injection or continuous infusion. Formulations for injection may be presented in unit
dosage form, e.g., in ampules or in multi-dose containers, with an added preservative.
The compositions may take such forms as suspensions, solutions or emulsions in oily or
aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing
and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous
solutions of the active compounds in water-soluble form. Additionally, suspensions of
the active compounds may be prepared as appropriate oily injection suspensions.
Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic
fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection
suspensions may contain substances which increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may
also contain suitable stabilizers or agents which increase the solubility of the compounds
to allow for the preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile
pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as
suppositories or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or other glycerides. In addition to the formulations described previously, the
compounds may also be formulated as a depot preparation. Such long acting
formulations may be administered by implantation (for example subcutaneously or
intramuscularly) or by intramuscular injection. Thus, for example, the compounds may
be formulated with suitable polymeric or hydrophobic materials (for example as an
emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives,
for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible
organic polymer, and an aqueous phase. The co-solvent system may be the VPD
co-solvent system. VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar
surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in
absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1
with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic
compounds well, and itself produces low toxicity upon systemic administration.
Naturally, the proportions of a co-solvent system may be varied considerably without
destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar
surfactants may be used instead of polysorbate 80; the fraction size of polyethylene
glycol may be varied; other biocompatible polymers may replace polyethylene glycol,
e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds may be employed. Liposomes and emulsions are well known examples of
delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by
those skilled in the art. Sustained-release capsules may, depending on their chemical
nature, release the compounds for a few weeks up to over 100 days. Depending on the
chemical nature and the biological stability of the therapeutic reagent, additional
strategies for protein or other active ingredient stabilization may be employed.
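
The 1:1 dilution of VPD with 5% dextrose in water described above can be worked through numerically. The following is a minimal sketch, assuming (as a simplification not stated in the text) that volumes are additive on mixing; the function name `blend` and the dictionary layout are illustrative only.

```python
# Nominal compositions in % w/v, taken from the text above.
# VPD is made up to volume in absolute ethanol, so ethanol is not listed.
VPD = {"benzyl alcohol": 3.0, "polysorbate 80": 8.0, "polyethylene glycol 300": 65.0}
D5W = {"dextrose": 5.0}  # 5% dextrose in water


def blend(parts):
    """Return the % w/v of each solute after mixing (composition, volume) pairs.

    Assumption (not from the source): volumes are additive on mixing.
    """
    total_volume = sum(volume for _, volume in parts)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    result = {}
    for composition, volume in parts:
        for solute, percent in composition.items():
            result[solute] = result.get(solute, 0.0) + percent * volume / total_volume
    return result


# VPD diluted 1:1 with 5% dextrose in water (equal volumes of each).
vpd_5w = blend([(VPD, 1.0), (D5W, 1.0)])
for solute, percent in vpd_5w.items():
    print(f"{solute}: {percent:.1f}% w/v")
```

Under the additive-volume assumption, each VPD component is halved in the final VPD:5W mixture and the dextrose concentration falls to half of its starting value.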

The pharmaceutical compositions also may comprise suitable solid or gel phase 
carriers or excipients. Examples of such carriers or excipients include but are not limited 
to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, 
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be provided as salts with pharmaceutically compatible counter ions. Such
pharmaceutically acceptable base addition salts are those salts which retain the biological
effectiveness and properties of the free acids and which are obtained by reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate,
potassium benzoate, triethanol amine and the like.

The pharmaceutical composition of the invention may be in the form of a
complex of the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide antigens. The protein and/or peptide antigen will deliver a stimulatory
signal to both B and T lymphocytes. B lymphocytes will respond to antigen through their
surface immunoglobulin receptor. T lymphocytes will respond to antigen through the T
cell receptor (TCR) following presentation of the antigen by MHC proteins. MHC and
structurally related proteins including those encoded by class I and class II MHC genes
on host cells will serve to present the peptide antigen(s) to T lymphocytes. The antigen
components could also be supplied as purified MHC-peptide complexes alone or with
co-stimulatory molecules that can directly signal T cells. Alternatively antibodies able to
bind surface immunoglobulin and other molecules on B cells as well as antibodies able to
bind the TCR and other molecules on T cells can be combined with the pharmaceutical
composition of the invention.

The pharmaceutical composition of the invention may be in the form of a
liposome in which protein of the present invention is combined, in addition to other
pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist
in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers
in aqueous solution. Suitable lipids for liposomal formulation include, without limitation,
monoglycerides, diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like. Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed, for example, in U.S. Patent Nos. 4,235,871; 4,501,728; [...];
4,737,323, all of which are incorporated herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and
severity of the condition being treated, and on the nature of prior treatments which the
patient has undergone. Ultimately, the attending physician will decide the amount of
protein or other active ingredient of the present invention with which to treat each
individual patient. Initially, the attending physician will administer low doses of protein
or other active ingredient of the present invention and observe the patient's response.
Larger doses of protein or other active ingredient of the present invention may be
administered until the optimal therapeutic effect is obtained for the patient, and at that
point the dosage is not increased further. It is contemplated that the various
pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 μg to about 100 mg (preferably about 0.1 μg to about 10 mg, more
preferably about 0.1 μg to about 1 mg) of protein or other active ingredient of the present
invention per kg body weight. For compositions of the present invention which are
useful for bone, cartilage, tendon or ligament regeneration, the therapeutic method
includes administering the composition topically, systemically, or locally as an implant
or device. When administered, the therapeutic composition for use in this invention is, of
course, in a pyrogen-free, physiologically acceptable form. Further, the composition may
desirably be encapsulated or injected in a viscous form for delivery to the site of bone,
cartilage or tissue damage. Topical administration may be suitable for wound healing
and tissue repair. Therapeutically useful agents other than a protein or other active
ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone
and/or cartilage formation, the composition would include a matrix capable of delivering
the protein-containing or other active ingredient-containing composition to the site of
bone and/or cartilage damage, providing a structure for the developing bone and cartilage
and optimally capable of being resorbed into the body. Such matrices may be formed of
materials presently in use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The particular
application of the compositions will define the appropriate formulation. Potential
matrices for the compositions may be biodegradable and chemically defined calcium
sulfate, tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and
polyanhydrides. Other potential materials are biodegradable and biologically
well-defined, such as bone or dermal collagen. Further matrices are comprised of pure
proteins or extracellular matrix components. Other potential matrices are
nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the
above mentioned types of material, such as polylactic acid and hydroxyapatite or
collagen and tricalcium phosphate. The bioceramics may be altered in composition, such
as in calcium-aluminate-phosphate and processing to alter pore size, particle size, particle
shape, and biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of
lactic acid and glycolic acid in the form of porous particles having diameters ranging
from 150 to 800 microns. In some applications, it will be useful to utilize a sequestering
agent, such as carboxymethyl cellulose or autologous blood clot, to prevent the protein
compositions from disassociating from the matrix.

A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which
represents the amount necessary to prevent desorption of the protein from the polymer
matrix and to provide appropriate handling of the composition, yet not so much that the
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein
[...]

The therapeutic compositions are also presently valuable for veterinary
applications. Particularly domestic animals and thoroughbred horses, in addition to
humans, are desired patients for such treatment with proteins or other active ingredients
of the present invention. The dosage regimen of a protein-containing pharmaceutical
composition to be used in tissue regeneration will be determined by the attending
physician considering various factors which modify the action of the proteins, e.g.,
amount of tissue weight desired to be formed, the site of damage, the condition of the
damaged tissue, the size of a wound, type of damaged tissue (e.g., bone), the patient's
age, sex, and diet, the severity of any infection, time of administration and other clinical
factors. The dosage may vary with the type of matrix used in the reconstitution and with
inclusion of other proteins in the pharmaceutical composition. For example, the
addition of other known growth factors, such as IGF I (insulin like growth factor I), to the
final composition, may also affect the dosage. Progress can be monitored by periodic
assessment of tissue/bone growth and/or repair, for example, X-rays, histomorphometric
determinations and tetracycline labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including,
without limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in the presence of proteins of the present invention in order to proliferate
or to produce a desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention [...]
For example, a dose can be formulated in animal models to achieve a circulating
concentration range that can be used to more accurately determine [...]
protein's biological activity). Such information can be used to more accurately determine
useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results
in amelioration of symptoms or a prolongation of survival in a patient. Toxicity and
therapeutic efficacy of such compounds can be determined by standard pharmaceutical
procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the
dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic effects is the
therapeutic index and it can be expressed as the ratio between LD50 and ED50.
Compounds which exhibit high therapeutic indices are preferred. The data obtained from
these cell culture assays and animal studies can be used in formulating a range of dosage
for use in humans. The dosage of such compounds lies preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity. The dosage
may vary within this range depending upon the dosage form employed and the route of
administration utilized. The exact formulation, route of administration and dosage can be
chosen by the individual physician in view of the patient's condition. See, e.g., Fingl et
al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount
and interval may be adjusted individually to provide plasma levels of the active moiety
which are sufficient to maintain the desired effects, or minimal effective concentration
(MEC). The MEC will vary for each compound but can be estimated from in vitro data.
Dosages necessary to achieve the MEC will depend on individual characteristics and
route of administration. However, HPLC assays or bioassays can be used to determine
plasma concentrations.
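
The therapeutic index defined in this paragraph is a simple ratio, and a short sketch may make the comparison between compounds concrete. The compound names and dose values below are hypothetical and serve only as an illustration; the function name is likewise an assumption.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50 / ED50; both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {"compound_a": (400.0, 4.0), "compound_b": (150.0, 15.0)}

# Compounds with a higher therapeutic index are preferred, so rank descending.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
print(ranked)
```

A larger ratio means a wider margin between the dose that is effective and the dose that is lethal in half of the test population.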

Dosage intervals can also be determined using MEC value. Compounds should
be administered using a regimen which maintains plasma levels above the MEC for
10-90% of the time, preferably between 30-90% and most preferably between 50-90%.
In cases of local administration or selective uptake, the effective local concentration of
the drug may not be related to plasma concentration.
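
One way to check a regimen against the MEC criterion above is to estimate, from sampled plasma concentrations, the fraction of the dosing interval spent at or above the MEC. The sketch below assumes linear interpolation between samples, which is a simplification introduced here and not part of the text; the sample data are hypothetical.

```python
def fraction_time_above(times, concs, mec):
    """Fraction of the sampled interval with concentration at or above mec.

    Assumption (not from the source): concentration varies linearly
    between consecutive samples.
    """
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    above = 0.0
    points = list(zip(times, concs))
    for (t0, c0), (t1, c1) in zip(points, points[1:]):
        dt = t1 - t0
        if c0 >= mec and c1 >= mec:
            above += dt  # whole segment at or above the MEC
        elif c0 < mec and c1 < mec:
            continue  # whole segment below the MEC
        else:
            # Segment crosses the MEC once; count the part above it.
            above += dt * (max(c0, c1) - mec) / abs(c1 - c0)
    return above / (times[-1] - times[0])


def within_target(fraction, low=0.10, high=0.90):
    """True if the fraction of time above the MEC lies in [low, high]."""
    return low <= fraction <= high


# Hypothetical samples: hours after dosing and plasma concentration (arbitrary units).
hours = [0, 1, 2, 4, 8]
plasma = [0.0, 8.0, 6.0, 3.0, 0.5]
coverage = fraction_time_above(hours, plasma, mec=2.0)
print(f"{coverage:.1%} of the interval at or above the MEC")
```

The `low` and `high` arguments can be set to whichever of the ranges stated in the text is being checked.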

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 μg/kg to 100 mg/kg of body weight daily,
with the preferred dose being about 0.1 μg/kg to 25 mg/kg of patient body weight daily,
varying in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at longer or shorter intervals.
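
The weight-based ranges in the preceding paragraph translate into absolute daily amounts by simple multiplication. The following sketch shows that arithmetic for a hypothetical 70 kg patient; the function and constant names are illustrative, and the result is a unit conversion only, not a dosing recommendation.

```python
UG_PER_MG = 1000.0  # micrograms per milligram

# (lower bound in ug/kg, upper bound in mg/kg), as stated in the text above.
EXEMPLARY_RANGE = (0.01, 100.0)
PREFERRED_RANGE = (0.1, 25.0)


def daily_dose_window_ug(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Return (low, high) total daily dose in micrograms for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low = weight_kg * low_ug_per_kg
    high = weight_kg * high_mg_per_kg * UG_PER_MG
    return low, high


# Hypothetical 70 kg patient.
low, high = daily_dose_window_ug(70.0, *PREFERRED_RANGE)
print(f"preferred window: {low:.1f} ug to {high / UG_PER_MG:.0f} mg per day")
```

Keeping both bounds in a single unit (micrograms here) avoids mixing μg/kg and mg/kg when the two ends of the range are compared.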

The amount of composition administered will, of course, be dependent on the
subject being treated, on the subject's age and weight, the severity of the affliction, the
manner of administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device
which may contain one or more unit dosage forms containing the active ingredient. The
pack may, for example, comprise metal or plastic foil, such as a blister pack. The pack or
dispenser device may be accompanied by instructions for administration. Compositions
comprising a compound of the invention formulated in a compatible pharmaceutical
carrier may also be prepared, placed in an appropriate container, and labeled for
treatment of an indicated condition.

4.13 ANTIBODIES

Also included in the invention are antibodies to proteins, or fragments of proteins
of the invention. The term "antibody" as used herein refers to immunoglobulin molecules
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules
that contain an antigen-binding site that specifically binds (immunoreacts with) an
antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal,
chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In
general, an antibody molecule obtained from humans relates to any of the classes IgG,
IgM, IgA, IgE and IgD, which differ from one another by the nature of the heavy chain
present in the molecule. Certain classes have subclasses as well, such as IgG1, IgG2, and
others. Furthermore, in humans, the light chain may be a kappa chain or a lambda chain.
Reference herein to antibodies includes a reference to all such classes, subclasses and
types of human antibody species.

An isolated related protein of the invention, or a portion or fragment thereof, may
be intended to serve as an antigen, and additionally can be used as an immunogen
to generate antibodies that immunospecifically bind the antigen, using standard
techniques for polyclonal and monoclonal antibody preparation. The full-length protein
can be used or, alternatively, the invention provides antigenic peptide fragments of the
antigen for use as immunogens. An antigenic peptide fragment comprises at least 6
amino acid residues of the amino acid sequence of the full length protein, such as an
amino acid sequence shown in SEQ ID NO: 1-438, and encompasses an epitope thereof
such that an antibody raised against the peptide forms a specific immune complex with
the full length protein or with any fragment that contains the epitope. Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located
on its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of alpha-2-macroglobulin-like protein that is located on the
surface of the protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human
related protein sequence will indicate which regions of a related protein are particularly
hydrophilic and, therefore, are likely to encode surface residues useful for targeting
antibody production. As a means for targeting antibody production, hydropathy plots
showing regions of hydrophilicity and hydrophobicity may be generated by any method
well known in the art, including, for example, the Kyte Doolittle or the Hopp Woods
methods, either with or without Fourier transformation. See, e.g., Hopp and Woods,
1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol. Biol.
157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
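
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the Kyte Doolittle scale, without Fourier transformation. In the example below the window length of 7, the cutoff of 0.0 and the peptide sequence are illustrative assumptions, not values taken from this specification; negative window averages indicate hydrophilic stretches that may be surface exposed.

```python
# Kyte Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean Kyte Doolittle score for each window of the given length."""
    seq = seq.upper()
    unknown = set(seq) - set(KD)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    scores = [KD[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(seq, window=7, cutoff=0.0):
    """Return (start index, subsequence) for windows scoring below the cutoff."""
    seq = seq.upper()
    profile = hydropathy_profile(seq, window)
    return [
        (i, seq[i:i + window])
        for i, score in enumerate(profile)
        if score < cutoff
    ]


# Hypothetical peptide, for illustration only.
peptide = "MKKLLIVAGDEERKSTPNQ"
for start, fragment in hydrophilic_windows(peptide):
    print(start, fragment)
```

Windows reported by `hydrophilic_windows` are candidates only; whether a given stretch is actually surface exposed or immunogenic has to be established experimentally.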

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

The term "specific for" indicates that the variable regions of the antibodies of the 
invention recognize and bind polypeptides of the invention exclusively (i.e., able to
distinguish the polypeptide of the invention from other similar polypeptides despite
sequence identity, homology, or similarity found in the family of polypeptides), but may
also interact with other proteins (for example, S. aureus protein A or other antibodies in
ELISA techniques) through interactions with sequences outside the variable region of the
antibodies, and in particular, in the constant region of the molecule. Screening assays to 
determine binding specificity of an antibody of the invention are well known and 
routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow 
et al. (Eds), Antibodies A Laboratory Manual; Cold Spring Harbor Laboratory; Cold 
Spring Harbor, NY (1988), Chapter 6. Antibodies that recognize and bind fragments of 
the polypeptides of the invention are also contemplated, provided that the antibodies are 
first and foremost specific for, as defined above, full-length polypeptides of the 
invention. As with antibodies that are specific for full length polypeptides of the 
invention, antibodies of the invention that recognize fragments are those which can 
distinguish polypeptides from the same family of polypeptides despite inherent sequence 
identity, homology, or similarity found in the family of proteins. 

Antibodies of the invention are useful for, for example, therapeutic purposes (by
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the
invention. Kits comprising an antibody of the invention for any of the purposes
described herein are also comprehended. In general, a kit of the invention also includes a
control antigen for which the antibody is immunospecific. The invention further provides
a hybridoma that produces an antibody according to the invention. Antibodies of the
invention are useful for detection and/or purification of the polypeptides of the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics for both conditions
associated with the protein and also in the treatment of some forms of cancer where
abnormal expression of the protein is involved. In the case of cancerous cells or
leukemic cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and preventing the metastatic spread of the cancerous cells, which may be
mediated by the protein.

The labeled antibodies of the present invention can be used for in vitro, in vivo,
and in situ assays to identify cells or tissues in which a fragment of the polypeptide of
interest is expressed. The antibodies may also be used directly in therapies or other
diagnostics. The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics such as
polycarbonate, complex carbohydrates such as agarose and Sepharose®, acrylic resins
such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such
solid supports are well known in the art (Weir, D.M. et al., "Handbook of Experimental
Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10
(1986); Jacoby, W.D. et al., Meth. Enzym. 34, Academic Press, N.Y. (1974)). The
immobilized antibodies of the present invention can be used for in vitro, in vivo, and in
situ assays as well as for immuno-affinity purification of the proteins of the present
invention.

Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of
these antibodies are discussed below.

4.13.1 POLYCLONAL ANTIBODIES

For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with
the native protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate immunogenic preparation can contain, for example, the naturally occurring
immunogenic protein, a chemically synthesized polypeptide representing the
immunogenic protein, or a recombinantly expressed immunogenic protein. Furthermore,
the protein may be conjugated to a second protein known to be immunogenic in the
mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and
soybean trypsin inhibitor. The preparation can further include an adjuvant. Various
adjuvants used to increase the immunological response include, but are not limited to,
Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants that can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can
[...] techniques, such as affinity chromatography using protein A or protein G, which
provide primarily the IgG fraction of immune serum. Subsequently, or alternatively, the
specific antigen which is the target of the immunoglobulin sought, or an epitope thereof,
may be immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No.
8 (April 17, 2000), pp. 25-28).

4.13.2 MONOCLONAL ANTIBODIES

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition",
as used herein, refers to a population of antibody molecules that contain only one
molecular species of antibody molecule consisting of a unique light chain gene product
and a unique heavy chain gene product. In particular, the complementarity determining
regions (CDRs) of the monoclonal antibody are identical in all the molecules of the
population. MAbs thus contain an antigen-binding site capable of immunoreacting with a
particular epitope of the antigen characterized by a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing
antibodies that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment
thereof or a fusion protein thereof. Generally, either peripheral blood lymphocytes are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
[...] immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to
form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice, Academic
Press, (1986) pp. 59-103). Immortalized cell lines are usually transformed mammalian
cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or
mouse myeloma cell lines are employed. The hybridoma cells can be cultured in a
suitable culture medium that preferably contains one or more substances that inhibit the
growth or survival of the unfused, immortalized cells. For example, if the parental cells
lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT),
the culture medium for the hybridomas typically will include hypoxanthine, aminopterin,
and thymidine ("HAT medium"), which substances prevent the growth of HGPRT-
deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable
high level expression of antibody by the selected antibody-producing cells, and are
[...] 133:3001 (1984); Brodeur et al., [...]
radioimmunoassay (RIA) or enzyme-linked immunoabsorbent assay (ELISA). Such
techniques and assays are known in the art. The binding affinity of the monoclonal
antibody can, for example, be determined by the Scatchard analysis of Munson and
Pollard, Anal. Biochem., 107:220 (1980). Preferably, antibodies having a high degree of
specificity and a high binding affinity for the target antigen are isolated.

After the desired hybridoma cells are identified, the clones can be subcloned by
limiting dilution procedures and grown by standard methods. Suitable culture media for
this purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a
mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified
from the culture medium or ascites fluid by conventional immunoglobulin purification
procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography,
gel electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such
as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of the invention can be readily isolated and sequenced using conventional
procedures (e.g., by using oligonucleotide probes that are capable of binding specifically
to genes encoding the heavy and light chains of murine antibodies). The hybridoma
cells of the invention serve as a preferred source of such DNA. Once isolated, the DNA can
be placed into expression vectors, which are then transfected into host cells such as
simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not
otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal
antibodies in the recombinant host cells. The DNA also can be modified, for example, by
substituting the coding sequence for human heavy and light chain constant domains in
place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature
368, 812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all
or part of the coding sequence for a non-immunoglobulin polypeptide. Such a
non-immunoglobulin polypeptide can be substituted for the constant domains of an antibody
of the invention, or can be substituted for the variable domains of one antigen-combining
site of an antibody of the invention to create a chimeric bivalent antibody.

4.13.3 HUMANIZED ANTIBODIES

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally
comprised of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a non-human immunoglobulin. Humanization can be performed following
the method of Winter and co-workers (Jones et al., Nature, 321:522-525 (1986);
Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536
(1988)), by substituting rodent CDRs or CDR sequences for the corresponding sequences
of a human antibody. (See also U.S. Patent No. 5,225,539). In some instances, Fv
framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues that are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general,
the humanized antibody will comprise substantially all of at least one, and typically two,
variable domains, in which all or substantially all of the CDR regions correspond to those
of a non-human immunoglobulin and all or substantially all of the framework regions are
those of a human immunoglobulin consensus sequence. The humanized antibody
optimally also will comprise at least a portion of an immunoglobulin constant region
(Fc), typically that of a human immunoglobulin (Jones et al., 1986; Riechmann et al.,
1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)).

4.13.4 HUMAN ANTIBODIES

Fully human antibodies relate to antibody molecules in which essentially the
entire sequences of both the light chain and the heavy chain, including the CDRs, arise
from human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies" herein. Human monoclonal antibodies can be prepared by the trioma
technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol
Today 4: 72) and the EBV hybridoma technique to produce human monoclonal
antibodies (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy,
Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the
practice of the present invention and may be produced by using human hybridomas (see
Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by transforming human
B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381
(1991); Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be
made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in
which the endogenous immunoglobulin genes have been partially or completely
inactivated. Upon challenge, human antibody production is observed, which closely
resembles that seen in humans in all respects, including gene rearrangement, assembly,
and antibody repertoire. This approach is described, for example, in U.S. Patent Nos.
5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994));
Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51
(1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar
(Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman
animals that are modified so as to produce fully human antibodies rather than the
animal's endogenous antibodies in response to challenge by an antigen. (See PCT
publication WO 94/02602). The endogenous genes encoding the heavy and light
immunoglobulin chains in the nonhuman host have been incapacitated, and active loci
encoding human heavy and light chain immunoglobulins are inserted into the host's
genome. The human genes are incorporated, for example, using yeast artificial
chromosomes containing the requisite human DNA segments. An animal which provides
all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the
Xenomouse™ as disclosed in PCT publications WO 96/33735 and WO 96/34096. This
animal produces B cells that secrete fully human immunoglobulins. The antibodies can
be obtained directly from the animal after immunization with an immunogen of interest,
as, for example, a preparation of a polyclonal antibody, or alternatively from
immortalized B cells derived from the animal, such as hybridomas producing monoclonal
antibodies. Additionally, the genes encoding the immunoglobulins with human variable
regions can be recovered and expressed to obtain the antibodies directly, or can be further
modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment
genes from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent rearrangement of the locus and to prevent formation of a transcript of a
rearranged immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector containing a gene encoding a selectable marker, and producing from the
embryonic stem cell a transgenic mouse whose somatic and germ cells contain the gene
encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a
light chain into another mammalian host cell, and fusing the two cells to form a hybrid
cell. The hybrid cell expresses an antibody containing the heavy chain and the light
chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody
that binds immunospecifically to the relevant epitope with high affinity, are disclosed in
PCT publication WO 99/53049.

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab
expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid
and effective identification of monoclonal Fab fragments with the desired specificity for a
protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the idiotypes to a protein antigen may be produced by techniques known in the
art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an
F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule
with papain and a reducing agent; and (iv) Fv fragments.

4.13.6 BISPECIFIC ANTIBODIES

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies
that have binding specificities for at least two different antigens. In the present case, one
of the binding specificities is for an antigenic protein of the invention. The second
binding target is any other antigen, and advantageously is a cell-surface protein or
receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the
random assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas) produce a potential mixture of ten different antibody molecules, of which
only one has the correct bispecific structure. The purification of the correct molecule is
usually accomplished by affinity chromatography steps. Similar procedures are disclosed
in WO 93/08829, published 13 May 1993, and in Traunecker et al., 1991 EMBO J.,
10:3655-3659.
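
The figure of ten possible molecules can be checked by simple enumeration. The following Python sketch is illustrative only and is not part of the cited procedures. It assumes that each of two heavy chains (labelled H1 and H2 here, hypothetical names) pairs at random with either of two light chains (L1, L2), and that an antibody is an unordered pair of such heavy/light half-molecules.

```python
from itertools import combinations_with_replacement, product

# Hypothetical chain labels: H1/L1 come from one parental antibody, H2/L2 from the other.
HEAVY = ["H1", "H2"]
LIGHT = ["L1", "L2"]


def quadroma_products(heavy_chains, light_chains):
    """Enumerate every antibody a quadroma could assemble under random pairing.

    A half-molecule is one heavy chain paired with one light chain; an antibody
    is an unordered pair of half-molecules (the two arms are interchangeable).
    """
    half_molecules = list(product(heavy_chains, light_chains))
    return list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(molecule):
    """True when each arm carries a matched heavy/light pair and the arms differ."""
    return set(molecule) == {("H1", "L1"), ("H2", "L2")}


molecules = quadroma_products(HEAVY, LIGHT)
correct = [m for m in molecules if is_correct_bispecific(m)]

if __name__ == "__main__":
    print(len(molecules), "possible molecules;", len(correct), "correctly bispecific")
```

Four half-molecules taken two at a time with repetition give ten combinations, of which exactly one has both arms correctly paired, which is why a purification step is needed.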

Antibody variable domains with the desired binding specificities (antibody-
antigen combining sites) can be fused to immunoglobulin constant domain sequences.
The fusion preferably is with an immunoglobulin heavy-chain constant domain,
comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to have the
first heavy-chain constant region (CH1) containing the site necessary for light-chain
binding present in at least one of the fusions. DNAs encoding the immunoglobulin
heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into
separate expression vectors, and are co-transfected into a suitable host organism. For
further details of generating bispecific antibodies see, for example, Suresh et al., Methods
in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between
a pair of antibody molecules can be engineered to maximize the percentage of
heterodimers that are recovered from recombinant cell culture. The preferred interface
comprises at least a part of the CH3 region of an antibody constant domain. In this
method, one or more small amino acid side chains from the interface of the first antibody
molecule are replaced with larger side chains (e.g. tyrosine or tryptophan).
Compensatory "cavities" of identical or similar size to the large side chain(s) are created
on the interface of the second antibody molecule by replacing large amino acid side
chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as
homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody fragments have been described in the literature. For example,
bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science
229:81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved
to generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol
complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation. The Fab' fragments generated are then converted to
thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB derivatives is then
reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is mixed with an
equimolar amount of the other Fab'-TNB derivative to form the bispecific antibody. The
bispecific antibodies produced can be used as agents for the selective immobilization of
enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and
chemically coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med.
175:217-225 (1992) describe the production of a fully humanized bispecific antibody
F(ab')2 molecule. Each Fab' fragment was separately secreted from E. coli and subjected
to directed chemical coupling in vitro to form the bispecific antibody. The bispecific
antibody thus formed was able to bind to cells overexpressing the ErbB2 receptor and
normal human T cells, as well as trigger the lytic activity of human cytotoxic
lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments
directly from recombinant cell culture have also been described. For example, bispecific
antibodies have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5):1547-1553 (1992). The leucine zipper peptides from the Fos and Jun proteins
were linked to the Fab' portions of two different antibodies by gene fusion. The antibody
homodimers were reduced at the hinge region to form monomers and then re-oxidized to
form the antibody heterodimers. This method can also be utilized for the production of
antibody homodimers. The "diabody" technology described by Hollinger et al., Proc.
Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an alternative mechanism for
making bispecific antibody fragments. The fragments comprise a heavy-chain variable
domain (VH) connected to a light-chain variable domain (VL) by a linker which is too
short to allow pairing between the two domains on the same chain. Accordingly, the VH
and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. [...]
See Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example,
trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic
arm of an immunoglobulin molecule can be combined with an arm which binds to a
triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the
particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to
cells which express a particular antigen. These antibodies possess an antigen-binding
arm and an arm which binds a cytotoxic agent or a radionuclide chelator, such as
EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest binds the
protein antigen described herein and further binds tissue factor (TF).

4.13.7 HETEROCONJUGATE ANTIBODIES

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted
cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360;
WO 92/200373; EP 03089). It is contemplated that the antibodies can be prepared in
[...] crosslinking agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction or by forming a thioether bond. Examples of suitable reagents for this
purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed,
for example, in U.S. Patent No. 4,676,980.

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased complement-
mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See Caron
et al., J. Exp Med., 176: 1191-1195 (1992) and Shopes, J. Immunol., 148: 2918-2922
(1992). Homodimeric antibodies with enhanced anti-tumor activity can also be prepared
using heterobifunctional cross-linkers as described in Wolff et al., Cancer Research, 53:
2560-2565 (1993). Alternatively, an antibody can be engineered that has dual Fc
regions and can thereby have enhanced complement lysis and ADCC capabilities. See
Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

4.13.9 IMMUNOCONJUGATES

The invention also pertains to immunoconjugates comprising an antibody
conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments
thereof), or a radioactive isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be
used include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain,
modeccin A chain, alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins (PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin,
crotin, Saponaria officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the trichothecenes. A variety of radionuclides are available for the
production of radioconjugated antibodies. Examples include 212Bi, 131I, 131In, 90Y, and
186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol)
propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-
ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), and bis-active
fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a ricin
immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the
antibody. See WO 94/11026.
In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate
is administered to the patient, followed by removal of unbound conjugate from the
circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that
is in turn conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 
In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such as
floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable media can be used to
create a manufacture comprising computer readable medium having recorded thereon a
nucleotide sequence of the present invention. As used herein, "recorded" refers to a
process for storing information on computer readable medium. A skilled artisan can
readily adopt any of the presently known methods for recording information on computer
readable medium to generate manufactures comprising the nucleotide sequence
information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can be represented
in a word processing text file, formatted in commercially-available software such as
WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a
database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any number of data processor structuring formats (e.g. text file or database)
in order to obtain computer readable medium having recorded thereon the nucleotide
sequence information of the present invention.
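
As one illustration of recording sequence information in an ASCII text file, the following Python sketch writes and re-reads sequences in the widely used FASTA layout. It is a minimal sketch only: the function names, the record name and the short placeholder sequence are hypothetical and are not taken from the sequence listing.

```python
import os
import tempfile


def write_fasta(path, records, width=60):
    """Record {name: sequence} pairs as plain ASCII text in FASTA layout."""
    with open(path, "w", encoding="ascii") as handle:
        for name, sequence in records.items():
            handle.write(f">{name}\n")
            # Wrap the sequence so that no line exceeds `width` characters.
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read a FASTA file back into a {name: sequence} dictionary."""
    records = {}
    name = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line
    return records


if __name__ == "__main__":
    # Placeholder record; the name and bases are invented for illustration.
    example = {"example_record": "ATGGCCAAGCTTGGATCCGAATTCGGG" * 4}
    path = os.path.join(tempfile.mkdtemp(), "sequences.fasta")
    write_fasta(path, example)
    print(read_fasta(path) == example)
```

A flat text file of this kind suits sequential reading, whereas a database table would be the more natural choice when records are to be retrieved by identifier, which reflects the point above that the storage structure follows from the chosen means of access.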

By providing any of the nucleotide sequences SEQ ID NOs: 1 - 438 or a
representative fragment thereof; or a nucleotide sequence at least 95% identical to any of
the nucleotide sequences of SEQ ID NOs: 1 - 438 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may
be useful in producing commercially important proteins such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.
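
The notion of an open reading frame can be made concrete with a short Python sketch. This is not BLAST or BLAZE and does not involve a Sybase system; it is a plain six-frame scan written for illustration, assuming the standard start codon ATG and the stop codons TAA, TAG and TGA. The function names and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns (strand, start, end, orf) tuples. Coordinates are 0-based and,
    for the "-" strand, refer to positions on the reverse complement.
    The codon count includes the stop codon.
    """
    seq = seq.upper()
    results = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(template) - 2, 3):
                codon = template[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, start, i + 3, template[start:i + 3]))
                    start = None
    return results


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

A production search would normally also filter on ORF length and compare the translated ORFs against known proteins, which is the role played by the similarity search programs named above.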

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing unit
(CPU), input means, output means, and data storage means. A skilled artisan can readily
appreciate that any one of the currently available computer-based systems is suitable for
use in the present invention. As stated above, the computer-based systems of the present
invention comprise a data storage means having stored therein a nucleotide sequence of
the present invention and the necessary hardware means and software means for
supporting and implementing a search means. As used herein, "data storage means"
refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of a known sequence which
match a particular target sequence or target motif. A variety of known algorithms are
disclosed publicly, and a variety of commercially available software for conducting search
means is available and can be used in the computer-based systems of the present invention.
Examples of such software include, but are not limited to, Smith-Waterman, MacPattern
(EMBL), BLASTN and BLASTA (NPOLYPEPTIDEIA). A skilled artisan can readily
recognize that any one of the available algorithms or implementing software packages for
conducting homology searches can be adapted for use in the present computer-based
systems. As used herein, a "target sequence" can be any nucleic acid or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will
be present as a random occurrence in the database. The most preferred sequence length
of a target sequence is from about 10 to 300 amino acids, more preferably from about 30
to 100 nucleotide residues. However, it is well recognized that searches for
commercially important fragments, such as sequence fragments involved in gene
expression and protein processing, may be of shorter length.
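
The relationship between target length and chance matches can be estimated with a short calculation. The Python sketch below is illustrative only and assumes a simplified model that is not stated in the text: every residue in the database is drawn independently and uniformly from the alphabet (4 nucleotides or 20 amino acids), and only exact matches are counted. The function name is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in a random database.

    Under the uniform, independent-residue assumption each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    if target_length > database_length:
        return 0.0
    positions = database_length - target_length + 1
    return positions * alphabet_size ** (-target_length)


if __name__ == "__main__":
    database = 1_000_000  # residues; an arbitrary size chosen for illustration
    for length in (6, 10, 20, 30):
        hits = expected_random_hits(length, database)
        print(f"{length:>2} nt target: {hits:.3g} expected chance matches")
```

Under this model a six-nucleotide target is expected to occur by chance hundreds of times in a million-base database, while a thirty-nucleotide target is essentially never expected to, which is consistent with the preference for longer targets expressed above.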

As used herein, "a target structural motif," or "target motif," refers to any 
rationally selected sequence or combination of sequences in which the sequence(s) are 
chosen based on a three-dimensional configuration which is formed upon the folding of 
the target motif. There are a variety of target motifs known in the art. Protein target 
motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic 
acid target motifs include, but are not limited to, promoter sequences, hairpin structures 
and inducible expression elements (protein binding sequences). 

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be 
used to control gene expression through triple helix formation or antisense DNA or RNA, 
both of which methods are based on the binding of a polynucleotide sequence to DNA or 
RNA. Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in 
length and are designed to be complementary to a region of the gene involved in 
transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., 
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA 
itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as 
Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple 
helix-formation optimally results in a shut-off of RNA transcription from DNA, while 
antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. 
Both techniques have been demonstrated to be effective in model systems. Information 
contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
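
As an illustration of how sequence information feeds antisense design, the sketch below derives a DNA oligonucleotide complementary to a chosen window of an mRNA. It is a hedged example, not a method taken from this specification: the function name `antisense_oligo` and its parameters are hypothetical, and only the 20 to 40 base length preference comes from the text above.

```python
# Watson-Crick complements for a DNA oligonucleotide.
_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(sense, start, length=20):
    """Return a DNA oligo complementary to sense[start:start+length].

    `sense` is the mRNA (or its cDNA sense strand); U is treated as T.
    The result is written 5'->3', i.e. the reverse complement of the window.
    Characters other than A, C, G, T/U raise KeyError.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    seq = sense.upper().replace("U", "T")
    region = seq[start:start + length]
    if start < 0 or len(region) != length:
        raise ValueError("window falls outside the sequence")
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(region))


if __name__ == "__main__":
    mrna = "AUG" + "C" * 20          # toy sequence, not from the Sequence Listing
    print(antisense_oligo(mrna, 0, 20))
```

Choosing which window to target (for example, the region around a start codon) is a separate design decision that this sketch does not attempt.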

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or 
expression of one of the ORFs of the present invention, or homolog thereof, in a test 
sample, using a nucleic acid probe or antibodies of the present invention, optionally 
conjugated or otherwise associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the 
polynucleotide for a period sufficient to form the complex, and detecting the complex, so 
that if a complex is detected, a polynucleotide of the invention is detected in the sample. 
Such methods can also comprise contacting a sample under stringent hybridization 
conditions with nucleic acid primers that anneal to a polynucleotide of the invention 
under such conditions, and amplifying annealed polynucleotides, so that if a 
polynucleotide is amplified, a polynucleotide of the invention is detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the 
polypeptide for a period sufficient to form the complex, and detecting the complex, so 
that if a complex is detected, a polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying 
for binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample 
vary. Incubation conditions depend on the format employed in the assay, the detection 
methods employed, and the type and nature of the nucleic acid probe or antibody used in 
the assay. One skilled in the art will recognize that any one of the commonly available 
hybridization, amplification or immunological assay formats can readily be adapted to 
employ the nucleic acid probes or antibodies of the present invention. Examples of such 
assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related 
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, 
G.R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 
(1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Immunoassays: 
Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science 
Publishers, Amsterdam, The Netherlands (1985). The test samples of the present 
invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described 
method will vary based on the assay format, nature of the detection method and the 
tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein 
extracts or membrane extracts of cells are well known in the art and can readily be 
adapted in order to obtain a sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain 
the necessary reagents to carry out the assays of the present invention. Specifically, the 
invention provides a compartment kit to receive, in close confinement, one or more 
containers which comprise: (a) a first container comprising one of the probes or 
antibodies of the present invention; and (b) one or more other containers comprising one 
or more of the following: wash reagents, reagents capable of detecting presence of a 
bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in 
separate containers. Such containers include small glass containers, plastic containers or 
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from 
one compartment to another compartment such that the samples and reagents are not 
cross-contaminated, and the agents or solutions of each container can be added in a 
quantitative fashion from one compartment to another. Such containers will include a 
container which will accept the test sample, a container which contains the antibodies 
used in the assay, containers which contain wash reagents (such as phosphate buffered 
saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the 
bound antibody or probe. Types of detection reagents include labeled nucleic acid probes, 
labeled secondary antibodies, or in the alternative, if the primary antibody is labeled, the 
enzymatic, or antibody binding reagents which are capable of reacting with the labeled 
antibody. One skilled in the art will readily recognize that the disclosed probes and 
antibodies of the present invention can be readily incorporated into one of the established 
kit formats which are well known in the art. 

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in 
medical imaging of sites expressing the molecules of the invention (e.g., where the 
polypeptide of the invention is involved in the immune response, for imaging sites of 
inflammation or infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such 
methods involve chemical attachment of a labeling or imaging agent, administration of 
the labeled polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging 
the labeled polypeptide in vivo at the target site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present 
invention further provides methods of obtaining and identifying agents which bind to a 
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set 
forth in SEQ ID NOs: 1 - 438, or bind to a specific domain of the polypeptide encoded by 
the nucleic acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the 
present invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid. 

In general, therefore, such methods for identifying compounds that bind to a 
polynucleotide of the invention can comprise contacting a compound with a 
polynucleotide of the invention for a time sufficient to form a polynucleotide/compound 
complex, and detecting the complex, so that if a polynucleotide/compound complex is 
detected, a compound that binds to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind 
to a polypeptide of the invention can comprise contacting a compound with a polypeptide 
of the invention for a time sufficient to form a polypeptide/compound complex, and 
detecting the complex, so that if a polypeptide/compound complex is detected, a 
compound that binds to a polypeptide of the invention is identified. 

Methods for identifying compounds that bind to a polypeptide of the invention 
can also comprise contacting a compound with a polypeptide of the invention in a cell for 
a time sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell, and detecting the complex by 
detecting reporter gene sequence expression, so that if a polypeptide/compound complex 
is detected, a compound that binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate 
the activity of a polypeptide of the invention (that is, increase or decrease its activity, 
relative to activity observed in the absence of the compound). Alternatively, compounds 
identified via such methods can include compounds which modulate the expression of a 
polynucleotide of the invention (that is, increase or decrease expression relative to 
expression levels observed in the absence of the compound). Compounds, such as 
compounds identified via the methods of the invention, can be tested using standard 
assays well known to those of skill in the art for their ability to modulate 
activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be 
selected and screened at random or rationally selected or designed using protein modeling 
techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical 
agents and the like are selected at random and are assayed for their ability to bind to the 
protein encoded by the ORF of the present invention. Alternatively, agents may be 
rationally selected or designed. As used herein, an agent is said to be "rationally selected 
or designed" when the agent is chosen based on the configuration of the particular 
protein. For example, one skilled in the art can readily adapt currently available 
procedures to generate peptides, pharmaceutical agents and the like, capable of binding to 
a specific peptide sequence, in order to generate rationally designed antipeptide peptides, 
for example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," In 
Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and 
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as 
broadly described, can be used to control gene expression through binding to one of the 
ORFs or EMFs of the present invention. As described above, such agents can be 
randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a 
skilled artisan to design sequence specific or element specific agents, modulating the 
expression of either a single ORF or multiple ORFs which rely on the same EMF for 
expression control. One class of DNA binding agents are agents which contain base 
residues which hybridize or form a triple helix formation by binding to DNA or RNA. 
Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or 
can be a variety of sulfhydryl or polymeric derivatives which have base attachment 
capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple 
helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 
(1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - 
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of 
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally 
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization 
blocks translation of an mRNA molecule into polypeptide. Both techniques have been 
demonstrated to be effective in model systems. Information contained in the sequences 
of the present invention is necessary for the design of an antisense or triple helix 
oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present 
invention can be used as a diagnostic agent. Agents which bind to a protein encoded by 
one of the ORFs of the present invention can be formulated using known techniques to 
generate a pharmaceutical composition. 

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific 
nucleic acid hybridization probes capable of hybridizing with naturally occurring 
nucleotide sequences. The hybridization probes of the subject invention may be derived 
from any of the nucleotide sequences SEQ ID NOs: 1 - 438. Because the corresponding 
gene is only expressed in a limited number of tissues, a hybridization probe derived from 
any of the nucleotide sequences SEQ ID NOs: 1 - 438 can be used as an indicator of 
the presence of RNA of cell type of such a tissue in a sample. 

Any suitable hybridization technique can be employed, such as, for example, in 
situ hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 
provides additional uses for oligonucleotides based upon the nucleotide sequences. Such 
probes used in PCR may be of recombinant origin, may be chemically synthesized, or a 
mixture of both. The probe will comprise a discrete nucleotide sequence for the detection 
of identical sequences or a degenerate pool of possible sequences for identification of 
closely related genomic sequences. 
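
To make the notion of a degenerate pool concrete, the sketch below expands a probe written with standard IUPAC ambiguity codes into every discrete sequence it represents. This is an illustration only; the function name `expand_degenerate` is hypothetical and the example probes are not taken from SEQ ID NOs: 1 - 438.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence encoded by a degenerate probe.

    The pool size is the product of the number of bases allowed at each
    position, so it grows quickly with the number of ambiguous positions.
    """
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    pool = expand_degenerate("ATGRAYN")   # toy probe: 2 * 2 * 4 = 16 members
    print(len(pool), pool[:4])
```

Because the pool size multiplies at each ambiguous position, a probe with many degenerate positions can represent far more sequences than are practical to synthesize as a mixture.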

Other means for producing specific hybridization probes for nucleic acids include 
the cloning of nucleic acid sequences into vectors for the production of mRNA probes. 
Such vectors are known in the art and are commercially available and may be used to 
synthesize RNA probes in vitro by means of the addition of the appropriate RNA 
polymerase such as T7 or SP6 RNA polymerase and the appropriate radioactively labeled 
nucleotides. The nucleotide sequences may be used to construct hybridization probes for 
mapping their respective genomic sequences. The nucleotide sequence provided herein 
may be mapped to a chromosome or specific regions of a chromosome using well known 
genetic and/or chromosomal mapping techniques. These techniques include in situ 
hybridization, linkage analysis against known chromosomal markers, hybridization 
screening with libraries or flow-sorted chromosomal preparations specific to known 
chromosomes, and the like. The technique of fluorescent in situ hybridization of 
chromosome spreads has been described, among other places, in Verma et al (1988) 
Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. 
Examples of genetic map data can be found in the 1994 Genome Issue of Science 
(265:1981f). Correlation between the location of a nucleic acid on a physical 
chromosomal map and a specific disease (or predisposition to a specific disease) may 
help delimit the region of DNA associated with that genetic disease. The nucleotide 
sequences of the subject invention may be used to detect differences in gene sequences 
between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly 
practiced using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to 
those of skill in the art using any suitable support such as glass, polystyrene or Teflon. One 
strategy is to precisely spot oligonucleotides synthesized by standard synthesizers. 
Immobilization can be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. 
Microbiol. 28(6) 1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987; 
Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent binding of base 
modified DNA (Keller et al., 1988; 1989); all references being specifically incorporated 
herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 
3072-6 describe the use of biotinylated probes, although these are duplex probes, that are 
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be 
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating 
any surface with streptavidin. Biotinylated probes may be purchased from various sources, 
such as, e.g., Operon Technologies (Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be 
used. Nunc Laboratories have developed a method by which DNA can be covalently bound 
to the microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface 
grafted with secondary amino groups (>NH) that serve as bridge-heads for further covalent 
coupling. CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules 
may be bound to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing 
immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem. 
198(1) 138-42). 

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end 
has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond 
is employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as 
immobilization using only a single covalent bond is preferred. The phosphoramidate bond 
joins the DNA to the CovaLink NH secondary amino groups that are positioned at the end 
of spacer arms covalently grafted onto the polystyrene surface through a 2 nm long 
arm. To link an oligonucleotide to CovaLink NH via a phosphoramidate bond, the 
oligonucleotide terminus must have a 5'-end phosphate group. It is, perhaps, even possible 
for biotin to be covalently bound to CovaLink and then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) 
and denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 
1-MeIm7. A ss DNA solution is then dispensed into CovaLink NH strips (75 ul/well) 
standing on ice. 

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), 
dissolved in 10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are 
incubated for 5 hours at 50°C. After incubation the strips are washed using, e.g., 
Nunc-Immuno Wash; first the wells are washed 3 times, then they are soaked with washing 
solution for 5 min., and finally they are washed 3 times (wherein the washing solution is 0.4 
N NaOH, 0.25% SDS heated to 50°C). 

It is contemplated that a further suitable method for use with the present invention is 
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated 
herein by reference. This method of preparing an oligonucleotide bound to a support 
involves attaching a nucleoside 3'-reagent through the phosphate group by a covalent 
phosphodiester link to aliphatic hydroxyl groups carried by the support. The 
oligonucleotide is then synthesized on the supported nucleoside and protecting groups 
removed from the synthetic oligonucleotide chain under standard conditions that do not 
cleave the oligonucleotide from the support. Suitable reagents include nucleoside 
phosphoramidite and nucleoside hydrogen phosphorate. 

An on-chip strategy for the preparation of DNA probe arrays may be employed. 
For example, addressable laser-activated photodeprotection 
may be employed in the chemical synthesis of oligonucleotides directly on a glass surface, 
as described by Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by 
reference. Probes may also be immobilized on nylon supports as described by Van Ness et 
al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to Teflon using the method of 
Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all references being specifically 
incorporated herein. 

To link an oligonucleotide to a nylon support, as described by Van Ness et al. 
(1991), requires activation of the nylon surface via alkylation and selective activation of the 
5'-amine of oligonucleotides with cyanuric chloride. 

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, 
incorporated herein by reference. These authors used current photolithographic techniques 
to generate arrays of immobilized oligonucleotide probes (DNA chips). These methods, in 
which light is used to direct the synthesis of oligonucleotide probes in high-density, 
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside 
phosphoramidites, surface linker chemistry and versatile combinatorial synthesis strategies. 
A matrix of 256 spatially defined oligonucleotide probes may be generated in this manner. 

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, 
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC 
inserts, and RNA, including mRNA without any amplification steps. For example, 
Sambrook et al. (1989) describes three protocols for the isolation of high molecular weight 
DNA from mammalian cells (p. 9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors 
and/or prepared directly from genomic DNA or cDNA by PCR or other amplification 
methods. Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of 
DNA samples may be prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those 
of skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 
of Sambrook et al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) 
Nucleic Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA 
samples are passed through a small French pressure cell at a variety of low to intermediate 
pressures. A lever device allows controlled application of low to intermediate pressures to 
the cell. The results of these studies indicate that low-pressure shearing is a useful 
alternative to sonic and enzymatic DNA fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using 
the two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic 
Acids Res. 20(14) 3753-62. These authors described an approach for the rapid 
fragmentation and fractionation of DNA into particular sizes that they contemplated to be 
suitable for shotgun cloning and sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence 
PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions, which alter 
the specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA 
fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) 
quantitatively evaluated the randomness of this fragmentation strategy, using a CviJI** 
digest of pUC19 that was size fractionated by a rapid gel filtration method and directly 
ligated, without end repair, to a lac Z minus M13 cloning vector. Sequence analysis of 76 
clones showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and 
that new sequence data is accumulated at a rate consistent with random fragmentation. 
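
The two specificities described above can be compared in silico. The sketch below is a simplified illustration, not a model of actual digestion: it assumes a linear, fully cut sequence (pUC19 itself is circular), treats the relaxed CviJI** specificity as exactly PuGCPy, PyGCPy and PuGCPu, and uses hypothetical function names.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
# Pu = A or G, Py = C or T. The cut falls between the G and the C.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")                      # PuGCPy
RELAXED_SITE = re.compile(r"(?=(?:[AG]GC[ACGT]|[CT]GC[CT]))")    # + PuGCPu, PyGCPy


def cut_positions(seq, relaxed=False):
    """Return indices at which the sequence is cut (index of the C in GC)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Return the blunt-ended fragments of a complete digest of a linear sequence."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "TTAGCTAATGCCTT"                 # toy sequence, not pUC19
    print(fragments(toy))                  # normal specificity
    print(fragments(toy, relaxed=True))    # relaxed (CviJI**) specificity
```

Because the relaxed specificity recognizes three of the four NGCN classes, cut sites become much denser, which is consistent with the quasi-random fragment distribution reported above.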

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead 
of 2-5 ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or 
agarose gel electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or 
prepared, it is important to denature the DNA to give single stranded pieces available for 
hybridization. This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. 
The solution is then cooled quickly to 2°C to prevent renaturation of the DNA fragments 
before they are contacted with the chip. Phosphate groups must also be removed from 
genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon 
membrane. Spotting may be performed by using arrays of metal pins (the positions of which 
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of 
a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the 
density of the wells is achieved. One to 25 dots may be accommodated in 1 mm², 
depending on the type of label used. By avoiding spotting in some preselected number of 
rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray 
may be the same genomic segment of DNA (or the same gene) from different individuals, or 
may be different, overlapped genomic clones. Each of the subarrays may represent replica 
spotting of the same samples. In one example, a selected gene segment may be amplified 
from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate 
(all 96 wells containing the same sample). A plate for each of the 64 patients is prepared. By 
using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. Subarrays 
may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm and there may be a 1 mm space between subarrays. 
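
The 64-patient example can be laid out numerically. The sketch below is one possible geometry, offered under stated assumptions rather than as the layout actually used: it assumes the 96 pins sit on the standard 9 mm microtiter pitch, that each pin prints an 8 x 8 subarray (one dot per patient plate) by offsetting the pin head in 1 mm steps, and all names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device matching a 96-well plate
PIN_PITCH_MM = 9.0           # assumed: standard 96-well centre-to-centre spacing
SUBARRAY_SIDE = 8            # assumed: 8 x 8 = 64 dots, one per patient
DOT_PITCH_MM = 1.0           # 1 mm dot span, per the example above


def dot_position(patient, pin):
    """Return (x_mm, y_mm) of the dot printed by `pin` from `patient`'s plate.

    patient: 0..63, selects the offset inside every subarray.
    pin:     0..95, selects which of the 96 identical subarrays.
    """
    if not 0 <= patient < SUBARRAY_SIDE * SUBARRAY_SIDE:
        raise ValueError("patient index out of range")
    if not 0 <= pin < PIN_ROWS * PIN_COLS:
        raise ValueError("pin index out of range")
    pin_row, pin_col = divmod(pin, PIN_COLS)
    sub_row, sub_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + sub_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + sub_row * DOT_PITCH_MM
    return (x, y)


if __name__ == "__main__":
    print(dot_position(0, 0))     # first patient, first pin
    print(dot_position(63, 95))   # last patient, last pin
```

With these assumed numbers, eight dots at 1 mm pitch plus the unused ninth millimetre reproduce the 1 mm space between subarrays, and the printed area (about 107 x 71 mm) fits on an 8 x 12 cm membrane.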

Another approach is to use membranes or plates (available from NUNC, Naperville, 
Illinois) which may be partitioned by physical spacers e.g. a plastic grid molded over the 
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell 
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by 
exposure to flat phosphor-storage screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration 
of the present disclosure, one of skill in the art will appreciate that many other embodiments 
and variations may be made in the scope of the present invention. Accordingly, it is 
intended that the broader aspects of the present invention not be limited to the disclosure of 
the following examples. The present invention is not to be limited in scope by the 
exemplified embodiments which are intended as illustrations of single aspects of the 
invention, and compositions and methods which are functionally equivalent are within the 
scope of the invention. Indeed, numerous modifications and variations in the practice of the 
invention are expected to occur to those skilled in the art upon consideration of the present 
preferred embodiments. Consequently, the only limitations which should be placed upon 
the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby 
incorporated by reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from 
various human tissues and in some cases isolated from a genomic library derived from 
human chromosome using standard PCR, SBH sequence signature analysis and Sanger 
sequencing techniques. The inserts of the library were amplified with PCR using primers 
specific for the vector sequences which flank the inserts. Clones from cDNA libraries were 
spotted on nylon membrane filters and screened with oligonucleotide probes (e.g., 7-mers) 
to obtain signature sequences. The clones were clustered into groups of similar or identical 
sequences. Representative clones were selected for sequencing. 
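
The signature-and-cluster step can be pictured with a small in silico analogue. The sketch below is an assumption-laden illustration, not the procedure used in this Example: exact substring matching stands in for probe hybridization, similarity is measured with a Jaccard index, clustering is a simple greedy pass, and every name and threshold is hypothetical.

```python
def signature(clone_seq, probes):
    """Set of probes that 'hybridize', modelled here as exact substring matches."""
    seq = clone_seq.upper()
    return frozenset(p for p in probes if p in seq)


def jaccard(a, b):
    """Jaccard similarity of two probe sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_clones(clones, probes, threshold=0.8):
    """Greedily group clones whose signatures are similar or identical.

    clones: dict mapping clone name -> sequence.
    Each clone joins the first cluster whose representative signature has
    Jaccard similarity >= threshold; otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative_signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for representative, members in clusters:
            if jaccard(sig, representative) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    probes = ["ACGTACG", "TTTTTTT", "GGGCCCA"]            # toy 7-mers
    clones = {"c1": "AAACGTACGAA", "c2": "CCACGTACGCC", "c3": "AATTTTTTTAA"}
    print(cluster_clones(clones, probes))
```

One representative per cluster would then be picked for sequencing, which is the point of clustering before the Sanger step.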

In some cases, the 5' sequence of the amplified inserts was then deduced using a 
typical Sanger sequencing protocol. PCR products were purified and subjected to 
fluorescent dye terminator cycle sequencing. Single pass gel sequencing was done using a 
377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences. In 
some cases RACE (Random Amplification of cDNA Ends) was performed to further extend 
the sequence in the 5' direction. 



5.2 EXAMPLE 2 
Novel Nucleic Acids 

The novel nucleic acids of the invention were assembled [...] 
[...] 300 and percent identity greater than 95%. 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), [...] 
in the Sequence Listing as SEQ ID NOS: 1 - 438. 

Table 1 shows the various tissue sources of SEQ ID NO: 1 - 438. 

The nearest neighbor results for polypeptides encoded by SEQ ID NO: 1 - 438 
[...] using BLAST algorithm. [...] 
in the Sequence Listing. The homologues with identifiable functions for SEQ ID NO: 1 - 
438 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, 
Stanford, CA) (Wu et al., J. Comp. Biol., Vol. 6 pp. 219-235 (1999) herein incorporated 
by reference), all the sequences were examined to determine whether they had 
identifiable signature regions. Table 3 shows the signature region found in the indicated 
polypeptide sequences, the description of the signature, the eMatrix p-value(s) and the 
position(s) of the signature within the polypeptide sequence. 

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 
26(1) pp. 320-322 (1998) herein incorporated by reference) polypeptides encoded by 
SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438) were examined for domains with homology 
to certain peptide domains. Table 4 shows the name of the domain found, the 
description, the product of all the e-value of similar domains found, the pFam score for 
the identified domain within the sequence, number of similar domains found, and the 
position of the domain in the SEQ ID NO: being interrogated. 

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San 
Diego, CA) was used to predict the three-dimensional structure models for the 
polypeptides encoded by SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438). Models were 
generated by (1) PSI-BLAST, which is a multiple alignment sequence profile-based 
searching method developed by Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) 
High Throughput Modeling (HTM) (Molecular Simulations Inc. (MSI), San Diego, CA), 
which is an automated sequence and structure searching procedure 
(http://www.msi.com/), and (3) SeqFold™, which is a fold recognition method described 
by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)). This analysis was carried 
out, in part, by comparing the polypeptides of the invention with the known NMR 
(nuclear magnetic resonance) and x-ray crystal three-dimensional structures as templates. 
Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier given to the template 
structure; "Chain ID", the identifier of the subcomponent of the PDB template structure; 
"Compound Information", information on the PDB template structure and/or its 
subcomponents; "PDB Function Annotation", the function of the PDB template as 
annotated by the PDB files (http://www.rcsb.org/PDB/); the start and end amino acid positions 
of the protein sequence aligned; the PSI-BLAST score; the verify score; the SeqFold score; 
and the Potential(s) of Mean Force (PMF). The verify score, produced by GeneAtlas™ 
software (MSI), is based on the Profile-3D threading program developed in 
Dr. David Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and 
Eisenberg, Nature, 356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc. 
Natl. Acad. Sci. USA, 95:13597-13602. The verify score produced by GeneAtlas 
normalizes the verify score for proteins with different lengths so that a unified cutoff can 
be used to select good models as follows: 

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score) 

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring 
function that depends in part on the compactness of the model, sequence identity in the 
alignment used to build the model, and pairwise and surface mean force potentials (MFP). As 
given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good 
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good 
model. A SeqFold™ score of more than 50 is considered significant. A good model may 
also be determined by one of skill in the art based on all the information in Table 5 taken in 
totality. 
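
The normalization and the score ranges described above can be illustrated with a minimal sketch. It is not the GeneAtlas implementation: the function names are hypothetical, the "high score" is simply passed in as a number because the text does not define how it is obtained, and treating the 0 and 1.0 bounds as inclusive is an assumption.

```python
def normalized_verify_score(raw_score: float, high_score: float) -> float:
    """Apply the quoted formula: (raw - 1/2 high) / (1/2 high)."""
    if high_score <= 0:
        # Assumption: a non-positive high score is treated as invalid input.
        raise ValueError("high_score must be positive")
    half_high = high_score / 2.0
    return (raw_score - half_high) / half_high


def model_quality_flags(verify: float, pmf: float, seqfold: float) -> dict[str, bool]:
    """Report each criterion separately; the text leaves the overall call to the reader."""
    return {
        "verify_ok": 0.0 <= verify <= 1.0,    # normalized verify score in [0, 1]
        "pmf_ok": 0.0 <= pmf <= 1.0,          # PMF score in [0, 1]
        "seqfold_significant": seqfold > 50.0,  # SeqFold score of more than 50
    }


if __name__ == "__main__":
    score = normalized_verify_score(raw_score=75.0, high_score=100.0)
    print(score, model_quality_flags(score, pmf=0.9, seqfold=60.0))
```

The flags are kept separate rather than combined into one verdict because the text says a good model may also be judged from all of the Table 5 information taken together.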

The nucleotide sequence within the sequences that codes for signal peptide 
sequences and their cleavage sites can be determined using the Neural Network SignalP 
V1.1 program (from the Center for Biological Sequence Analysis, The Technical University 
of Denmark). The process for identifying prokaryotic and eukaryotic signal peptides and 
their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren 
Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and 
eukaryotic signal peptides and prediction of their cleavage sites", Protein Engineering, 
Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and 
a mean S score, as described in the Nielsen et al. reference, were obtained for the 
polypeptide sequences. Table 6 shows the position of the signal peptide in each of the 
polypeptides and the maximum score and mean score associated with that signal peptide. 
Table 7 correlates each of SEQ ID NO: 1-438 to a specific chromosomal location. 
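
The maximum and mean S scores reported in Table 6 can be illustrated with a short sketch. It assumes that per-residue S scores are already available as a list of floats, that the mean S score is the average over the residues of the predicted signal peptide, and that the maximum is taken over all supplied positions. The function name and input layout are hypothetical and are not part of the SignalP program.

```python
from statistics import fmean
from typing import Sequence, Tuple


def s_score_summary(s_scores: Sequence[float], signal_length: int) -> Tuple[float, float]:
    """Return (maximum S score, mean S score over the first signal_length residues)."""
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must be between 1 and len(s_scores)")
    max_s = max(s_scores)                      # highest per-residue S score supplied
    mean_s = fmean(s_scores[:signal_length])   # average over the predicted signal peptide
    return max_s, mean_s


if __name__ == "__main__":
    # Illustrative values only, not taken from Table 6.
    print(s_score_summary([0.9, 0.8, 0.7, 0.2, 0.1], signal_length=3))
```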

Table 8 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 
1-438, novel polypeptide sequences SEQ ID NO: 1-438, and their corresponding priority 
nucleotide sequences in the priority application USSN 09/774,528, herein incorporated 
by reference in its entirety. 
Table 1 

Tissue Origin | RNA/Tissue Source | Library Name | SEQ ID NO: 
adult brain | GIBCO | AB3001 | 76-77 91 106-107 115 134 163-164 178 [illegible] 232 255 276 279 322-323 
adult brain | GIBCO | ABD003 | 16 19 24 77 80-81 85 89-90 92 96 98 105 110 116 121-123 125 130-132 134-136 138 142-143 151 153 158-159 163-164 184 191 193 196 198 200 208-209 213-214 216 219-220 223 229 232-234 236 239 241 243 257-259 262 265 267 274-276 278 284 292 302 317 321 324-325 327 337-338 340 348 359 371 391-392 400 
adult brain | Clontech | ABR001 | 1 18-19 35 80 98 125 136 153 185 200 209 221 228-229 239 243 274-275 302 399-400 
adult brain | Clontech | ABR006 | 7-8 18 32 35 52 57 85 91 96 111 113 [illegible] 131 135 138-139 142 148 153-154 181 188 192 199 209-211 217 221 224 226 229 233 235 238 243 248 273 283-284 286 292 316 322 348 357 361 367 376 378 399 407 409 
adult brain | Clontech | ABR008 | [illegible] 6-11 19-21 23-25 31 35-37 39-41 [illegible] 72-73 76 80-81 85 88-90 94-95 97 102-105 109 111-112 114-119 121-122 126-131 134-135 138-139 144 146-150 152-153 156-157 159 168-172 174-175 178 180 182 185-186 189-190 194 196 198-201 203 205-210 217 219 221-222 224 229-230 232-233 236-239 243-244 248 253-256 260-261 263-265 273 276 281-282 286-289 291-292 299-300 302 304 315-317 319 321-322 324 326 329 331-332 341 352-357 360 362 365 367-368 370 376-377 379-380 383-384 387-389 391-392 394 396-402 407-410 412-413 419 425-426 433 
adult brain | Invitrogen | ABR014 | 9 [illegible] 200 233 282 [illegible] 
adult brain | Invitrogen | ABR015 | 14 31 69 121 124 163 [illegible] 216 224 291 377 
adult brain | Invitrogen | ABR016 | 92 136 219 279 
adult brain | Invitrogen | ABT004 | 2 7-8 20-21 33 85 90-91 96 97 102-103 108 121 123 129-131 138-139 143 146 151 153 157-158 172 178 180 209-210 213 219 229-230 232 234 239 308 321 330 360 365 370-373 375 401 412 
adipocytes | Stratagene | ADP001 | 3-4 23 36 79 81 106-107 116 129 133-134 147 151 154 158 179 181 192 196 222 230 256-257 287 292 [illegible] 313 329 359 
adrenal gland | Clontech | ADR002 | 2 25 27 33 57 76 85-86 88 96 98 105-108 114 121-122 125 129-130 134 147 164 178 180 182 198-199 201 205 207-208 240-241 244 246 253-254 257 261 276 280 292 320 329 336 352 403 
adult heart | GIBCO | AHR001 | 3 17-21 27 32 74 76 85 89-91 95-96 102-103 105-110 117 121 124-125 128 131 134-136 139 141 148 151-153 155-156 161 163 181-182 186 190 193 198 200-201 205 207 211-213 215 222 



adult kidney 



adult lung 



GIBCO 




225 229-230 234 251-254 2bV-2b9 ^b3 274- 
277 280 292-297 301 303-304 315-316 319 

,oQ_33i 345 359 ^ ft4 417 423-424 _ 

3 6 14 20-21 25-26 76 79 »b 89 94 101 111 
111 ns 121 124 126 130-131 138 146 163 

110 ^77-178 189 196 198 201 204 213.231 
25?-254 25I259 271 273-275 277 298 315 

r29'74 7^85 90 96 10b Hi Ui^ X 32^ 
i36 142 144 149 159 181 189 198 200 205- 

111 226 255 257 263 283 294 300 302-303 
328 358-359 365 426 



young liver 



adult liver 



GIBCO 



Invitrogen ALV002 



adult liver 
Ovary 



-T n 11 25-26 29 31 33 76 8b 9b lib 121-122 
124 130 143 146 156 158 164 178 182 
^87 189 229 248 253-254 261 278 283 304 

- 1 - ^1^23 26 31 33-34 38 bJ 56 90-92 94-9b 
i?8 Lf 124 128-129 138 141 146 148 153 
i" m 178 198 216 232 248 253 254 
?.B6-257 264 302 306 365 375 383 396 

76 81-82 85 89-91 94-98 104-109 111 
IlS-Jie 121-128 130-131 134 136 138-139 
141 143-144 146 149-150 152 155 157-160 

166 170-173 175 177-178 180 182 184- 
il ' JlS; ^93-194 196-197 200-201 212- 
913 215 217 222 225-226 228 230-233 235 
2il-2i3 245 248 253-259 261 266-267 270 
272-273 276-278 283-285 287 289 292 297- 
299 3J5-3O6 315-317 319 323-325 329-331 
1 lA 343-344 352 358-359 363-366 382-383 
:^86 389-390 412 



Placenta 



^3 92 n7 135 182 194 232 2 4 6-263-272 282 



adult spleen j GIBCO 



131 134 136 139 151 178 181 189 194 200 
, 210 lit lis 251 253-255 257 276 283 307- 
^09 315 329 3^A-.^55 357 392 400 ' 



testis 

bladder 
bone marrow 



GIBCO 



ATS001 



Invitrogen BLD001 
Clontech BMD001 



if7tirtr^6l97-r04-105 "4 130 
164 173 200 209 222 233 241 253-254 257 

285 --fiT ^"'^ ^^'^ 329 351-353 359 

! 108 130 150 212 226 236 240 242 2^1 ^ 
287 305 395-396 415 



bone marrow 



BMD002 



1 4-5 22 29-30 34 72 85 88 90 9-^^« 
104-107 109 111 113 117 120 123-125 128- 
i« 132 135 140 142 144 146 152 163 165- 
^70-173 177 180 182 186 189-190 198- 
2" 2I5 222 225 232 240-246 251-252 260- 
2ei 273-275 277-280 283-285 300 316 318 

fr Wltll 16 19 25 31 49 61-6. uj 

80 85 88 90 93-95 97-101 109-110 112 
HA 116-117 J^>1 126 129 132 135 141 144 . 
146 149-150 154 157 160 162-163 165-166 170-172 175 178-180 182-183 186-190 192-194 198-200 203 208 210-213 215 223 225 234 242 245 247 251-254 256-257 265 270 273 276-278 280 285 287 289 291 293-294 299 302 307 309 315 322 324 337-338 353 356-357 359 367 369 388 407 414 419 426 434 



bone marrow | Clontech | BMD007 | 144 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd010 | 1 34-35 95 152 161 171 182 206 219 242 260 267 276 280 288 297 300 315-316 412 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd011 | 45 51 167 188 216 251-252 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd012 | 2 10-11 18-21 29 31 34-35 40 42-43 45 48 50-52 69-71 87-89 94-95 98-105 109 111-113 117 120 123 125 127 131 135-136 138 146 158 163 165-169 175 180 187-188 191 198 201 208 216 219-221 224 226 234 236 238-239 241-246 251-252 260 264 270 276-277 279 281 283-284 287 295-296 314 319 321 327-328 331 333-334 337-341 343 351-352 361 365 369 379-380 387 389 395 397-399 402 406 410-412 417 419 424 426 431-433 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd013 | 29 48 101 146 167-169 187 219 234 327 333 339 341 365 412 433 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd015 | 29 86 90 95 98 110 113 118 132 158 171 184 193 218-220 243 284 310 385 410 419 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd016 | 3-4 20-21 29 38 85 88-89 95 105 119 122 131-133 140 185 211-212 225 256-257 273 276 302 318 379-380 390 400 419 
colon | Invitrogen | CLN001 | 4 25 33 85 138 146 148 158-159 198 210 229 301 360 384 397 
cervix | BioChain | CVX001 | 3 5 10-11 18 20-21 24-25 29 36 41 47 57 63 72 74 76 86 90 94 104 108-109 111 125 127 130 134 138 144 147 162 174 178-179 182 186 189 193 197 211 222 225-226 228 232 241 243 257 261 267 270 273-275 278-281 288-289 298 301-302 305 315 319 324-325 329 331 337-338 359 391-392 395 420 
endothelial cells | Stratagene | EDT001 | 3-6 18-19 24 27-29 35 72 76 79-80 85 89 96 98 104-107 111 117 119-121 124-131 134 136 138-139 141 144 146-147 149 152 158-159 166-167 170-173 178-179 182-183 186-187 191 193-194 196-197 200 210-211 222-224 226 231-232 236 241 243 246 248 253-256 258-259 276 279 282 287 292 300 302-303 315 329 337-338 358-362 382-383 385-388 



esophagus | BioChain | ESO002 | 257 
fetal brain | Clontech | FBR001 | 34 
fetal brain | Clontech | FBR004 | 3 139 144 271 284 337-338 
fetal brain | Clontech | FBR006 | 4 6-11 14 18-21 24 28 31 37-38 40 63 76 85 87 89-90 94-95 97 105 108-109 112-113 115 117-120 127-130 133 138 140 144-146 148 170 172 175 180 182 186-188 190 192 194 199 201 203 209-210 215 219 222 229-230 232-233 240 243 245 253-255 270 273 276 281 288-289 292 295 304 315 317 319 324 330-331 356-357 359-360 364 367-368 379-380 383 389 397 399-401 408-409 411 413 419 421 423 

WO 02/081731    PCT/US02/01222 



fetal brain | Invitrogen | FBT002 | 2 14 19 23 28 31 90 94 105 121 124 126 131 135 139 142 149 158 186 193 198 210 214-215 232 239 242 248 255 267 326 332 365 369 371 376-383 394 399 
fetal heart | Invitrogen | FHR001 | 4 7-8 10-11 14 17-21 28-29 31-32 60 64-65 73 85 87 92 95 102-103 105 108 111 113 117 119 121 125 128-129 134-135 141 152 154 156-157 160-161 172 176 178 194 196 198-200 203 208 212 215 218 222 226 229 233-234 253-257 261 265 272 276 281 292-293 295 303 305 319 325 327 337-338 341 345 349 354-355 367-368 389 395-396 398 412 417 436 
fetal kidney | Clontech | FKD001 | 1 14 22 94 110 115 132 134-135 146 178 189 199 235-236 242 247 257 267 292 295 359 
fetal kidney | Clontech | FKD002 | 22 31 38 40 46 94 122 127 131 156 160 194 198 229 253-254 270 292 303 319 354-355 389 396 
fetal kidney | Invitrogen | FKD007 | 303 
fetal lung | Clontech | FLG001 | 85 89 98-100 111 175 271 281 369 
fetal lung | Invitrogen | FLG003 | 84 88 106-107 122 135 140 146 160 181 246 272 284 292 328 330 396 404 416 426 
fetal liver-spleen | Soares | FLS001 | 1-3 6-12 14 19 23 28-31 33 57 59-60 72-76 78 80 83 85-138 140-141 143-144 146-155 157-161 163-197 200 204 208 210-211 223 225 230 232-233 235 241-243 245-266 268-273 277 281 285-287 292 297 303 314 329 343 346-347 357-359 369 397 399 407 415 
fetal liver-spleen | Soares | FLS002 | 1 3-4 6 10-12 23-24 29 31-33 35 37 53-54 74-76 79 81-82 86-89 91 94-95 99-104 106-109 111-112 115 117-120 122 125-126 128-129 132 134 136-138 141 146 149 153 157-159 162-166 170 172 175 178-180 183 185-191 194 196-197 205 207 212 222-225 228 232-233 239-241 248 251-252 255-256 258-259 261-262 264 266-267 270-271 273-275 277-278 283 285 287 298 305 315 317-318 322 330-332 337-338 341 343 349 357-360 365 388 390-391 399 402 418 424 



fetal liver-spleen | Soares | FLS003 | 12 29 91 98 111 119 156 163 165 178 186 193 210-211 276 286 315 322 346-347 357 365 424 
fetal liver | Invitrogen | FLV001 | 7-8 14 35 118 122-123 129 146 182 211 230 232 248 251-252 264 287 304 337-338 344 346-347 352 365 367-369 
fetal liver | Clontech | FLV002 | 102-103 147 149 300 
fetal liver | Clontech | FLV004 | 73 85 105 108 118 122 126 141 156-157 161 165 170 178 180 182 194 215 218 225 240 242 247 251-252 292 330 337-338 369 407 411 440 



fetal muscle | Invitrogen | FMS002 | 5 9 17-18 20-21 29 38 85 88 97 106-107 129 131 136 150-152 155 165 170 179 182 192-193 212-213 229 234 242 258-259 270 282 286 289 300 316 319 345 351 354-355 360 389 396 408 410 437 439 
fetal skin | Invitrogen | FSK001 | 2 4 7-8 29 33 42-43 49 51-52 58 74 82 85 90 94 110-111 116 118 121 133 136 138-139 145 151 154 156-157 161-162 172 181 184 186 193 198 200 205 207 209-211 222 227-230 232 235 240 246 253-257 266 270 276 292 295 299 316 318 323 330 332 337-340 343 357 369 389 394-395 412 422 427 
fetal skin | Invitrogen | FSK002 | 4 9 42 44 51 66 72 81 85 89-90 95 98 105 112-114 119 121 129 133 135 162 172 179-182 197 200 208 210 231 243-244 272 304 316 330 339 354-355 357 360 389 395 410 417 437 
fetal spleen | BioChain | FSP001 | 157 223 
umbilical cord | BioChain | FUC001 | 4-6 20-21 25 29 73-74 83 87 89-91 94 101 109 120 123 125 128 130-131 133 141 143-144 147 149 154 161 165 173 175 179 184 188 210-212 217 226 235 240 248 251-252 257 262 267 270 277 293 305 307 316 319 323 327 331 341 356 359 389 392 407 416 
fetal brain | GIBCO | HFB001 | 2-4 16 20-21 74 77 85 89-91 96-98 104-105 111 114 118 121-122 124-125 127-128 131 134 137-140 142 144 146-148 151 153 158-159 163-164 166 173 178 180 182 191 194 196 200 203 209-214 216-232 234-236 238-239 243 253-255 263 270 272-273 276 281 292 310 316 319-321 332 348 357 359 365 399 
macrophage | Invitrogen | HMP001 | 2 247 
infant brain | Soares | IB2002 | 2-4 7-8 19-22 26-27 31-32 35 73-74 80 85 89 91 96 98 106-107 110 112 118-119 121-122 125 128-131 134-144 148 153 164 166 172-173 177 180 186-187 191-194 196 202-203 208-210 217 219 223-224 227 229 232-234 236-237 239 241-243 245 248 253-259 273-275 278-279 282 287 294 298 309 314 317 322 327 330 333-334 341 348-350 360 368 376 379-380 382 396 406 424 



infant brain | Soares | IB2003 | 3-4 20-21 26 28 31 35 73 85 95-96 110 113 119 122-123 130-131 135 138 140 142-143 146 153 155 170 172-173 186 191-193 196 209 219 223 226 229 233-234 236 239 245 248 253-254 256-257 273 279 291-292 304 314 337-338 343 359 367 371 376 397 413 
lung, fibroblast | Stratagene | LFB001 | 3 6 31 72-73 90 92 105-107 124 126-127 133 136 139 144 146 172 189 198 204 233 235 246 258-259 268 272 276 282 310 335 359 434 
adult lung | Invitrogen | LGT002 | 4 19-21 28 33 35-36 49 72 79 81 85 88 90-91 94-95 101 106-107 109 118 120-125 127 130-131 133 135-138 141-142 144 147 149 157 159-161 163 166 170-173 193-194 196-197 212 216 218 221 223 226 228-229 231 233 241 247-248 253-255 257 261 266-267 270-275 277-278 282-283 292 298 301 303 315 318 324 331 335 354-355 359 367 369 381 392-393 398 



leukocytes | GIBCO | LUC001 | 1-5 15 19-21 28 30-33 37 72 74 91 94-95 97-100 108-109 113 115 117 119-122 124-125 127-128 134-138 141 144 146-148 150-151 157-158 160 162-167 170-173 175-178 180-181 187 189 192 194 197 200 212-213 215-216 218-219 223 225 228-232 241-242 245-246 251-254 261 272-276 278-282 284 287-290 297-298 305 307 310-314 325 331 336 340 358-359 372 399 414 
leukocytes | Clontech | LUC003 | 1 5 124 171 176 204 225 248 253-254 283 285 307 315 
melanoma | Clontech | MEL004 | 4-5 24 37 72-74 81 85 106-107 113 136 177 203 205-207 209 231 243 284-285 315-316 320 326 359 374 428 
mammary gland | Invitrogen | MMG001 | 2 4-5 7-8 10-12 29 31 34-35 38 50 80-81 85 89-90 92 94-97 105 108-109 119-124 126 128-130 135 138-139 141-142 144 146-147 153 155 157-159 163 178-179 181-182 198 200 209-210 219 223 228 230 232-233 235-236 239 242 248 253-255 257 260-261 265-267 270 272 281 287 292 294 315-316 318 324 327 330 337-340 354-355 357 369 372 383 392-395 401 404 
neuron | Stratagene | NTD001 | 35 47 89-90 111 118 164 232 253-254 276 324 331 382 
neuron | Stratagene | NTR001 | 20-21 37 122 147-149 170 179 181 186 212 226 258-259 265 276 369 436 438 
neuronal cells | Stratagene | NTU001 | 7-8 37 55 80 85 112 118 126-127 133 138 140-141 151 170 181 210 214 225-226 236 243 287 328 330-331 357 383 400 436 
pituitary gland | Clontech | PIT004 | 92 124 159 231 
placenta | Clontech | PLA003 | 34 46 88 126 128 159 182 186 197 201 267 278 281-282 305 330 356 361 365 418 
prostate | Clontech | PRT001 | 18 36 72 74 86 95 106-107 111 118 122 144 161 179 211 218 233 286 297 
rectum | Invitrogen | REC001 | 9 31 85 121 128 147 171 200 219 257 292 340 394 398 407 412 
salivary gland | Clontech | SAL001 | 3 24 38 80 122 136 147 189 241 282 296 310 351 392 395 415 
saliva gland | Clontech | SALS03 | 118 
small intestine | Clontech | SIN001 | 12 16 25 82-83 89-90 93 95 98 105-109 111 122-123 125-128 133-134 137 139 142 161 167 171 184 197 201 204 212 218 236 242-243 248-249 253-254 257 267 276 284-285 292 297 300 303 310 313 317-318 325 340 343 352 354-355 359 383 391 416 
spinal cord | Clontech | SPC001 | 3 39 84 86 94 96 105 115 117 130-131 134 136 141 143 148 155 176 190-191 203 213 224 233-234 236 239 279 283 298 320-321 332 336-338 356 359 365 404-406 






thalamus | Clontech | THA002 | 2 20-21 23 74 81 85 105-106 116 121 131 146 171 185 188 200 209 219 233 239 256 258-259 273 276 362 399 
thymus | Clontech | THM001 | 16 29 33 57 80 82 85 90 93-94 106-107 120 126 128 134 141 161 176 194 223 228 235 253-254 261 274-275 278 285 298 319 332 336 343 353 359 425 
thymus | Clontech | THMc02 | 1-2 7-9 14 26 34 44 73 75 82 85 87 94 98 106-107 109-111 117 119-120 125-126 128-129 139 141 144 147-148 151 154-155 162 165 170-172 175-176 179 182 186 193-194 199-200 208-209 213 218 233 235 240 242 247 253-254 257 265 276 281 287 290 305 307 312 319 336 342 354-356 359 364 367 399 408 412-413 415 419 421 426 429-433 
thyroid gland | Clontech | | 3 5 7-8 28 30-31 33 73-77 80 82 85 88 90-92 94 96-98 105-107 109 113 117 121-122 124-125 127-128 130 134 136 141 143 146-148 152 161-163 166 175 177-178 181 194 199 201 204 210 212 216 218 223-226 228 230-231 234 236 241 243 246 253-257 261 270 272-273 276-278 281-283 287 292 295 298 303-304 308 315 323 329 335 352 359 362 401 416-417 
trachea | Clontech | TRC001 | [illegible] 138 180 226 228 279 359 411 436 
uterus | Clontech | UTR001 | 3 10-11 23 [illegible] 92 106-107 109 111 141 197-198 218 241 257 270 274-275 302 315 329 396 400 413 

*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain 
mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA 
(Invitrogen), 4) Normal adult liver mRNA (Invitrogen), 5) Normal fetal kidney mRNA 
(Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA (Invitrogen), 8) 
human adrenal gland mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10) Human 
leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) human 
lymph node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid 
mRNA (Clontech), 15) human esophagus mRNA (BioChain), 16) human conceptional umbilical 
cord mRNA (BioChain). 
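
The SEQ ID NO: column of Table 1 uses a compact notation in which single numbers and inclusive ranges such as "337-338" are separated by spaces. As a minimal sketch of how such a cell could be expanded and searched, the following assumes the cell text has already been cleaned of extraction damage; the helper names and the small example dictionary are hypothetical, with the three rows copied from the fetal brain and esophagus entries above.

```python
from typing import Dict, List


def expand_seq_ids(cell: str) -> List[int]:
    """Expand a cell such as '3 139 144 271 284 337-338' into individual SEQ ID numbers."""
    ids: List[int] = []
    for token in cell.split():
        if "-" in token:
            start_text, end_text = token.split("-")
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {token!r}")
            ids.extend(range(start, end + 1))  # ranges are inclusive
        else:
            ids.append(int(token))
    return ids


def libraries_for(seq_id: int, table: Dict[str, str]) -> List[str]:
    """List the library names whose SEQ ID cell contains seq_id."""
    return [name for name, cell in table.items() if seq_id in expand_seq_ids(cell)]


if __name__ == "__main__":
    example = {
        "ESO002": "257",
        "FBR001": "34",
        "FBR004": "3 139 144 271 284 337-338",
    }
    print(expand_seq_ids(example["FBR004"]))
    print(libraries_for(338, example))
```

A cell that still contains damaged tokens raises ValueError here, which is a deliberate choice so that unreadable entries are not silently skipped.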
SEQ ID NO: | Accession No. | Species | Description | Score | % Identity 
1 | [illegible] | Homo sapiens | membrane-associated nucleic acid binding protein mRNA, partial cds. | | 34 
1 | gi7020305 | Homo sapiens | cDNA FLJ20301 fis, clone HEP06569. | 1728 | 47 
1 | gi7294120 | Drosophila melanogaster | CG16807 gene product | 1535 | 53 
2 | AAY57911 | Homo sapiens | Human transmembrane protein HTMPN-35. | 1258 | 82 
2 | AAB88406 | Homo sapiens | Human membrane or secretory protein clone PSEC0162. | 265 | 39 
2 | gi14272664 | Homo sapiens | unnamed protein product | 265 | 39 
3 | gi12654575 | Homo sapiens | Similar to gp25L2 protein, clone MGC:2142 IMAGE:2967520, mRNA, complete cds. | 1116 | 100 
3 | gi12845568 | Mus musculus | putative | 1099 | 98 
3 | gi996057 | Homo sapiens | H.sapiens mRNA for gp25L2 protein. | 1096 | 98 
4 | gi9971050 | Homo sapiens | Human DNA sequence from clone RP11-526K24 on chromosome 20. Contains a novel gene, the 5' end of a novel gene, two CpG islands, ESTs, GSSs and STSs, complete sequence. | 4348 | 99 
4 | AAB95086 | Homo sapiens | Human protein sequence SEQ ID NO:16999. | 3034 | 99 
4 | gi10433753 | Homo sapiens | cDNA FLJ12307 fis, clone MAMMA1001908. | 3034 | 99 
5 | gi4689106 | Homo sapiens | NADH-ubiquinone oxidoreductase B8 subunit | 505 | 100 
5 | gi2909862 | Homo sapiens | NADH-ubiquinone oxidoreductase subunit CI-B8 mRNA, complete cds. | 505 | 100 
5 | gi12539408 | Homo sapiens | NDUFA2 gene for NADH dehydrogenase (ubiquinone) 1 alpha subcomplex 2, complete cds. | 505 | 100 
6 | AAG64416 | Homo sapiens | Human nucleoprotein. | 3765 | 100 
6 | gi10443046 | Homo sapiens | Human DNA sequence from clone RP11-465L10 on chromosome 20. Contains 10 CpG islands, ESTs, STSs and GSSs. Contains the gene for a novel protein similar to Drosophila CG11399, the gene for a novel C2H2 type zinc finger protein similar to chicken FZF-1, a Ferritin light polypeptide (FTL) pseudogene, the MMP9 gene for matrix metalloproteinase 9 (gelatinase B, 92kD gelatinase, 92kD type IV collagenase) (CLG4B), a novel gene, the SLC12A5 gene for solute carrier family 12, (potassium-chloride transporter) member 5 (KIAA1176) and the 3' end of gene KIAA1637, complete sequence. | 3765 | 100 
6 | gi15426514 | Homo sapiens | clone MGC:16205 IMAGE:3640928, mRNA, complete cds. | 3765 | 100 
7 | AAG64416 | Homo sapiens | Human nucleoprotein. | 3366 | 100 
7 | gi10443046 | Homo sapiens | Human DNA sequence from clone RP11-465L10 on chromosome 20. Contains 10 CpG islands, ESTs, STSs and GSSs. Contains the gene for a novel protein similar to Drosophila CG11399, the gene for a novel C2H2 type zinc finger protein similar to chicken FZF-1, a Ferritin light polypeptide (FTL) pseudogene, the MMP9 gene for matrix metalloproteinase 9 (gelatinase B, 92kD gelatinase, 92kD type IV collagenase) (CLG4B), a novel gene, the SLC12A5 gene for solute carrier family 12, (potassium-chloride transporter) member 5 (KIAA1176) and the 3' end of gene KIAA1637, complete sequence. | 3366 | 100 
7 | gi15426514 | Homo sapiens | clone MGC:16205 IMAGE:3640928, mRNA, complete cds. | | 100 
8 | gi14571904 | Rattus norvegicus | lysosomal amino acid transporter 1 | 2145 | 85 
8 | AAE04910 | Homo sapiens | Human transporter and ion channel-23 (TRICH-23) protein. | 1239 | 56 
8 | gi7297404 | Drosophila melanogaster | CG13384 gene product | 837 | 43 
9 | AAB73686 | Homo sapiens | Human oxidoreductase protein ORP-19. | 1301 | 98 
9 | gi7291405 | Drosophila melanogaster | T3dh gene product | [illegible] | 59 
9 | gi5824752 | Caenorhabditis elegans | predicted using Genefinder~contains similarity to Pfam domain: PF00465 (Iron-containing alcohol dehydrogenases), Score=177.7, E-value=1.9e-50, N=2~cDNA EST EMBL:Z14517 comes from this gene; cDNA EST yk18d4.3 comes from this gene~cDNA EST yk18d4.5 comes from this gene; cDNA EST yk116f5.5 comes from this gene~cDNA EST yk132h3.3 comes from this gene; cDNA EST yk73d10.3 comes from this gene~cDNA EST yk93e9.3 comes from this gene; cDNA EST yk132h3.5 comes from this gene~cDNA EST yk73d10.5 comes from this gene; cDNA EST yk93e9.5 comes from this gene~cDNA EST yk135b6.5 comes from this gene; cDNA EST yk135b6.3 comes from this gene~cDNA EST yk201e5.3 comes from this gene; cDNA EST yk268b1.3 comes from this gene~cDNA EST yk261d6.3 comes from this gene; cDNA EST yk262h11.3 comes from this gene~cDNA EST yk292h11.3 comes from this gene; cDNA EST yk304d8.3 comes from this gene~cDNA EST yk344b7.3 comes from this gene; cDNA EST yk351a6.3 comes from this gene~cDNA EST yk366d9.3 comes from this gene; cDNA EST yk368e3.3 comes from this gene~cDNA EST yk372c11.3 comes from this gene; cDNA EST yk389g3.3 comes from this gene~cDNA EST yk422d2.3 comes from this gene; cDNA EST yk381d7.3 comes from this gene~cDNA EST yk201e5.5 comes from this gene; cDNA EST yk267f6.5 comes from this gene~cDNA EST yk268b1.5 comes from this gene; cDNA EST yk261d6.5 comes from this gene~cDNA EST yk262h11.5 comes from this gene; cDNA EST yk292h11.5 comes from this gene~cDNA EST yk304d8.5 comes from this gene; cDNA EST yk344b7.5 comes from this gene~cDNA EST yk368e3.5 comes from this gene; cDNA EST yk372c11.5 comes from this gene~cDNA EST yk351a6.5 comes from this gene; cDNA EST yk366d9.5 comes from this gene~cDNA EST yk389g3.5 comes from this gene; cDNA EST yk422d2.5 comes from this gene~cDNA EST yk560f4.3 comes from this gene; cDNA EST yk625h5.3 comes from this gene~cDNA EST yk381d7.5 comes from this gene; cDNA EST yk560f4.5 comes from this gene~cDNA EST yk625h5.5 comes from this gene | 685 | 52 






10 


AAB73686 


Homo sapiens 


Human oxidoreductase protein ORP- 
19. 


1552 


99 


1 A 




urosopmia 
melanogaster 


1 JKIU gCilC piUUUV/L 


891 


56 


10 


gi5824752 


Caenorhabditis 
elegans 


predicted using Genefinder-contains • 
similarity to Pfam domain: PF00465 
(Iron-containing alcohol 
dehydrogenases), Score=177,7, E- 
valuc=1.9e-50, N=2-cDNA EST 
EMBL:Z14517 comes from this gene; 
cDNA EST ykl8d4.3 comes from tins 
gene-cDNA EST ykl8d4.5 comes 
from this gene; cDNA EST ykl 16£5.5 
comes from this gene^DNA EST 
ykl32h3.3 comes from this gene; 
cDNA EST yk73dl0.3 comes from this 


730 


51 
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SEQIDNO: 


Accession No. 


Species 


Description 


Score 


% 
Identity 








gene--cDNA EST yk93e9.3 comes from 
this gene; cDNA EST ykl321i3.5 
comes from this gen6-K:DNA EST 
yk73dl0.5 comes from fliis gene; 
cDNA EST yk93e9.5 comes from this 
gene-cDNA EST ykl35b6.5 comes 
from this gene; cDNA EST ykl35b6.3 
comes from this gene~cDNA EST 
yk201e5.3 comes from this gene; 
cDNA EST yk268bl .3 comes from this 
genc-cDNA EST yk261d6.3 comes 
from this gene; cDNA EST yk262hll3 
comes from this geneM:DNA EST 
yk292hl 1 .3 comes from this gene; 
cDNA EST yk304d8.3 comes from this 
gene-cDNA EST yk344b7.3 comes 
from this gene; cDNA EST yk351a6.3 
comes, from this gene-cDNA EST 
yk366d9.3 comes from tbis gene; 
cDNA EST yk368e3.3 comes from this 
gene-cDNA EST yk372cl 1 .3 comes 
from this gene; cDNA EST yk389g3,3 
comes from this gene-cDNA EST 
yk422d2.3 comes from this gene; 
cDNA EST yk38 ld7.3 comes from this 
graie-cDNA EST yk201e5.5 comes 
from this gene; cDNA EST yk267f6.5 
comes from diis gene-cDNA EST 
yk268bl.5 comes from this gene; 
cDNA ESTyk261d6.5 comes from this 
gene^DNA EST yk262hl 1.5 comes 
from this gene; cDNA EST yk292hl 1.5 
comes from this gene~cDNA EST 
yk304d8.5 comes from tiiis gene; 
cDNA EST yk344b7.5 comes from this 

^^^^ «<T^XT A «r1#-^ iron's C 1. n .1 1 .. ■■ 

gene~clJXSA hbi yK3ooe3.3 comes 
trom tnis gene; cliina t>o i yiu /zci i .j 
comes jiuiu uus ffsuGr^xJisj\ x 
yk351a6.5 comes from this gene; 
CUJNA CtOi yMLDOvuy.D comes iromiiiis 
gene-cDNA EST yk389g3.5 comes
from this gene; cDNA EST yk422d2.5
comes from this gene~cDNA EST
yk560f4.3 comes from this gene;
cDNA EST yk625h5.3 comes from this
gene--cDNA EST yk381d7.5 comes
from this gene; cDNA EST yk560f4.5
comes from this gene~cDNA EST
yk625h5.5 comes from this gene






11 


AAB85166 


Homo sapiens 


Human Bcl-GL polypeptide.


1598 


87 


11 


gi14598300


Homo sapiens 


unnamed protein product 


1598 


87 


11 


gi12584085


Homo sapiens


apoptosis regulator BCL-G long form
(BCLG) mRNA, complete cds.


1598 


87 


12 


gi15077865


Mus musculus 


bullous pemphigoid antigen 1-b 


1253 


82


12 


gi15077863


Mus musculus


bullous pemphigoid antigen 1-a


1253 


82 


12 


gi6624582 


Homo sapiens 


Human DNA sequence from clone 
RP1-61B2 on chromosome 6p11.2-12.3.
Contains isoforms 1 and 3 of BPAG1
(bullous pemphigoid antigen 1
(230/240kD), an exon of a gene similar
to murine MACF cytoskeletal protein, 
STSs and GSSs, complete sequence. 


733 


99 


13 


gi3702270 


Homo sapiens 


chromosome 19, cosmid R31408, 
complete sequence. 


887 


93 


13 


81401845 


Homo sapiens


ribosomal protein L18a mRNA,
complete cds. 


887 


93 


13 


gi13960144


Homo sapiens


ribosomal protein L18a, clone
MGC:4476 IMAGE:2961519, mRNA,
complete cds. 


887 


93 


14 


AAB59090 


Homo sapiens 


Breast and ovarian cancer associated 
antigen protein sequence SEQ ID 798. 


496 


OA 

80 


14 


AAB44129 


Homo sapiens 


Human cancer associated protein 
sequence cM\i J-U iNu:iD/«t. 


453 




14 


gi14198321


Mus musculus


ribosomal protein L31


453 


81 


15 


gi56894o5 


Homo sapiens 


mKNA tor JUAA1U04 pTOtem, paraai 
cds. 




lUU 


1 c 

15 


gl4oo4ioo 


Homo sapiens 


mKJNA, Cl^JNA JJJvr^pDoOl^lxZU 

(from clone DKFZp586L1220); partial 
cds. 


1 AOS 


inn 


15 


gil3161145 


Homo sapiens 


zinc finger protein mRNA, complete
cds. 


369 


36 


10 




Mus musculus 


SKm*j3Url 




0^ 

y* 


16 


gi5870834


Mus musculus 


skm-BOP2 


2397 


91 


16 


gi1809322


Mus musculus 


t-rJOF 




07 

yj 


17 


gi13938126


Mus musculus


RIKEN cDNA 3732409O05 gene


2678 


98 


17 


gi12852375


Mus musculus 


putative 


2678 


98 


17 


gi7024433 


Torpedo 
marmorata


male sterility protein 2-like protein 


2307 


OA 

oO 


18 


AAB95482 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18007. 


1572 


67 


18 


gi14042809


Homo sapiens


cDNA FLJ14932 fis, clone
PLACE1009639. 


1572 


67 


18 


gi12053165


Homo sapiens


mRNA; cDNA DKFZp434K0427
(from clone DKFZp434K0427);
complete cds. 


1572 


67 


19 


gi7243159 


Homo sapiens


mRNA for KIAA1389 protein, partial 
cds. 


7842 


99 


19 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
alpha mRNA, complete cds. 


nil 


53 


19 


gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


3m 


53 


20 


gi7243159 


Homo sapiens 


mRNA for KIAA1389 protein, partial 
cds. 


7714 


98 


20 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 


3806 


54








alpha mRNA, complete cds. 






20 


gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


3797 


53 


21 


AAB95328 


Homo sapiens 


Human protein sequence SEQ ID
NO:17595. 


753 


61 


21 


AAB93757 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13432. 


753 


61 


21 


AAB29657 


Homo sapiens 


Human membrane-associated protein
HUMAP-14. 


753 


61 


22 


gi7673373 


Homo sapiens 


SCAN-related protein RAZ1 (RAZ1)
mRNA, partial cds. 


1104 


100 


22 


AAG93274 


Homo sapiens 


Human protein HP10543, 


900 


100 


22 


AAB42846 


Homo sapiens 


Human ORFX ORF2610 polypeptide 
sequence SEQ ID NO:5220. 


900 


100 


23 


gi7242963 


Homo sapiens 


mRNA for KIAA1304 protein, partial 
cds. 


5409 


99 


23 


gi3413874 


Homo sapiens 


mRNA for KIAA0456 protein, partial 
cds. 


3695 


67 


23 


AAB30852 


Homo sapiens 


Amino acid sequence of human signal 
transduction protein SGT6-1. 


3685 


68 


24 


AAG64386 


Homo sapiens 


Human alcohol dehydrogenase 39. 


1228 


77 


24 


gi12861800


Mus musculus 


putative 


1083 


66 


24 


gi3878713 


Caenorhabditis
elegans 


weak similarity with quinone 
oxidoreductase, contains similarity to 
Pfam domain: PF00107 (Zinc-binding 
dehydrogenases), Score=-80.6, E- 
value=6.2e-06, N=1-cDNA EST
yk164b4.5 comes from this
gene-cDNA EST yk164b4.3 comes
from this gene-cDNA EST yk264f3.5
comes from this gene 


556 


39 


25 


AAE02629 


Homo sapiens


Human secreted protein Zalpha37. 


2481 


100 


25 


gi14536691


Homo sapiens


unnamed protein product


2481 


100 


25 


AAY99419 


Homo sapiens 


Human PRO1780 (UNQ842) amino 
acid sequence SEQ ID NO:282. 


1960 


77 


26 


gi6102869


Homo sapiens


mRNA; cDNA DKFZp434H1235
(from clone DKFZp434H1235); partial
cds. 


831 


100 


26




Mus musculus 


putative 




y4 


26 


gi2198807 


Gallus gallus 


monocarboxylate transporter 3 


505 


29 


27 


gi7299069 


Drosophila 
melanogaster 


CGI 1755 gene product 


205 


34 


27 


gi3875367 


Caenorhabditis 
elegans 


contains 3 cysteine rich repeats 


136 


41 


27 


gi3249080 


Arabidopsis


Contains similarity to MYB
transcription factor isolog T01O24.1
gb|2288980 from A. thaliana BAC
gb|AC002335. 


69 


35 


28 


gi11041628


Homo sapiens 


RPL6 gene for ribosomal protein L6, 
complete cds. 


1207 


98 


28 


gi433416 


Homo sapiens 


Human mRNA for DNA-binding 
protein, TAXREB107, complete cds. 


1207 


98


28


gi13278717


Homo sapiens


ribosomal protein L6, clone
MGC:1635 IMAGE:2823733, mRNA, 
complete cds. 


1207 


98 


29 


AAG03810 


Homo sapiens 


Human secreted protein, SEQ ID NO:
7891. 


845 


100 


29 


gi186800


Homo sapiens 


Human ribosomal protein L12 mRNA, 
complete cds. 


845 


100 


29 


gi14198333


Homo sapiens 


ribosomal protein L12, clone 
MGC:9760 IMAGE:3855674, mRNA, 
complete cds. 


845 


100 


30 


AAB95051 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 16849. 


2965 


100 


30 


gi10433519


Homo sapiens


cDNA FLJ12118 fis, clone
MAMMA1000085, weakly similar to
PUTATIVE CYSTEINYL-TRNA
SYNTHETASE C29E6.06C (EC
6.1.1.16). 


2965 


100 


30 


gi13938199


Homo sapiens


hypothetical protein FLJ12118, clone
MGC:15044 IMAGE:2822557, mRNA,
complete cds. 


2959 


99 


31 


gi12858123


Mus musculus 


putative 


2441 


73 


31 


gi7959195 


Homo sapiens 


mRNA for KIAA1467 protein, partial 
cds. 


2232 


100 


31 


gi13278148


Mus musculus 


Similar to RIKEN cDNA 8430419L09 
gene 


794 


83 


32 


gi15530305


Homo sapiens


Similar to RIKEN cDNA 1700045I19
gene, clone MGC:2647 
IMAGE:3509621, mRNA, complete 
cds. 


1245 


84 


32 


gi9858803 


Mus musculus 


Zfp228 


512 


47 


32 


AAG75629 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:6393.


511 


46 


33 


gi8101071 


Homo sapiens 


golgin-like protein (GLP) gene, 
complete cds. 


312 


46 


33 


gi8099669 


Homo sapiens 


golgin-like protein (GLP) mRNA, 
complete cds. 


312 


46 


33 


gi11037008


Human
herpesvirus 8


latent nuclear antigen 


245 


40 


34 


gi437985 


Canis 
familiaris


Rab12 protein


1071 


99 


34 


gi206531 


Rattus 
norvegicus


RAB12 


995 


96 


34 


gi12851149


Mus musculus 


putative 


819 


96 


35 


gi13543689


Homo sapiens


Similar to RIKEN cDNA 4933405K01
gene, clone MGC:14799
IMAGE:4068454, mRNA, complete 
cds. 


1077 


96 


35 


gi12805373


Mus musculus 


Unknown (protein for MGC:7298) 


950 


84 


35 


gi12855529


Mus musculus 


putative 


642 


79 


36 


gi12697979


Homo sapiens 


mRNA for KIAA1717 protein, partial 
cds. 


1982 


100 


36 


gi1651678


Synechocystis
sp. PCC6803


ORF_ID:slr1485~hypothetical protein


185 


34


36


gi2739367 


Arabidopsis 
thaliana 


putative phosphatidylinositol-4-
phosphate 5-kinase


153 


28 


37 


gi3800892 


Homo sapiens 


neurexin III-alpha gene, partial cds.


1255 


99 


37 


gi294602 


Rattus 
norvegicus 


neurexin III-alpha


1160 


91 


37 


gi205716 


Rattus 
norvegicus 


neurexin II-alpha-a


561 


50 


38 


gi10047315


Homo sapiens


mRNA for KIAA1619 protein, partial
cds. 


4447 


99 


38 


gi8217424 


Homo sapiens 


Human DNA sequence from clone 
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ 

^lyilXxf J UUuJaUl JJIUICUI, LUC 

gene for a novel protein similar to 
KIAA0552, KIAA0341 and Fugu 
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46G11.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven 
putative CpG islands, complete 
sequence. 


4407 


99 


38 


gi4836757 


Mus musculus 


semaphorin subclass 4 member G


4021 


90 


39 


gi10438664


Homo sapiens


cDNA: FLJ22324 fis, clone
HRC05551. 


307 


100 


39 


gi13559240


Homo sapiens


Human DNA sequence from clone
RP5-842G6 on chromosome 20.
Contains the 3' end of a novel gene, the
3' end of the gene for a novel protein
similar to SEL1L (sel-1 (suppressor of
lin-12, C.elegans)-like), ESTs, STSs 
and GSSs, complete sequence. 


307 


100 


39 


gi13543669


Homo sapiens


hypothetical protein FLJ22324, clone
MGC:14701 IMAGE:4247211, mRNA,
complete cds. 


307 


100 


40 


gi14595019


Homo sapiens


mRNA for keratin 6 irs (KRT6IRS
gene).


2615 


99 


40 


gi6092075 


Mus musculus 


type II cytokeratin


2414 


91 


40 


gi15559584


Homo sapiens 


Similar to keratin 6A, clone 
MGC:20671 IMAGE:3639270, mRNA, 
complete cds. 


1468 


57 


41 


gi12655452


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene). 


1157 


86 


41 


gi12655464


Homo sapiens 


partial mRNA for keratin associated 
protein 4.15 (KRTAP4.15 gene). 


1090 


88 


41 


gi12655462


Homo sapiens 


mRNA for keratin associated protein 
4.14 (KRTAP4.14 gene). 


1063 


84


complete sequence.






59 


gi5802814 


Homo sapiens 


endogenous retrovirus HERV-K103, 
complete sequence. 


146 


zo 


60 


AAB94756 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15815. 


126 




60 


gi332612 


Gibbon ape 
leukemia virus 


pol polyprotein 


111 
113 




60 


gi3133302


Sus scrofa 


pol protein 


110 


53 


61 


gi10121625


Gillichthys
mirabilis


60S acidic ribosomal protein P1


127 


81 


61 


AAB44012 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1457. 


125 


78 


61 


AAB43434 


Homo sapiens


Human cancer associated protein
sequence SEQ ID NO:879.


125 


78 


62 


AAB12585 


Homo sapiens 


Human T cell activating protein SEQ 
ID NO:4.


140 


37 


62 


gi12805221


Mus musculus 


lymphocyte antigen 6 complex 


140 


37 


62 


gi198924


Mus musculus 


LY-6A.2 


140 


37 


63 


gi6969165 


Homo sapiens 


Human DNA sequence from clone 
RP3-475N16 on chromosome 6p12.3-
21.2. Contains the genes for CTG4A,
pre-T cell receptor alpha, a novel
protein similar to RPL7A (60S
ribosomal protein L7A) and the 3' end 
of gene KIAA0240. Contains ESTs, 
STSs, GSSs and four putative CpG 
islands, complete sequence. 


573 


67 


63 


gi12841727


Mus musculus 


putative 






63 


gi15293877


Ictalurus 
punctatus 


ribosomal protein L7 


314 


38 


64 


gi181573


Homo sapiens 


Human cytokeratin 8 (CK8) gene, 
complete cds. 


1147


'70 


64 


gi181400


Homo sapiens


Human cytokeratin 8 mRNA, complete
cds. 


1147 


78 


64 


gi400416 


Homo sapiens 


H.sapiens KRT8 mRNA for keratin 8. 


1147 


79 


65 


gi13620887


Mus musculus 


mitochondrial ribosomal protein S6 


o33 


lUU 


65 


gi13620885


Homo sapiens


MRPS6 mRNA for mitochondrial
ribosomal protein S6, partial cds.


565 




65 


gi14603226


Homo sapiens 


clone MGC: 19576 IMAGE:4304420, 
mRNA, complete cds. 




OJ 


66 


gll3dJ 


Homo sapiens


mRNA for PAR-6 gamma, complete
cds. 


1956 


100 


66 


gi8037909


Mus musculus 


PAR6A 


1490 


76 


66 


gi9453884


Homo sapiens 


mRNA for 16-5-5, partial cds. 


1304 


93 


67 


AAB95293 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17517. 


776 


79 


67 


AAG81270 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:58.


776 


79 


67 


gi14035848


Homo sapiens 


unnamed protein product 


116 


79 


68 


gi7020759


Homo sapiens


cDNA FLJ20565 fis, clone REC00542.


930 


60 


68 


gi15216181


Homo sapiens


mRNA for putative 67-11-3 protein.


927 


60 


68 


gil5930a69 


Homo sapiens 


Similar to hypothetical protein 
FLJ20565, clone MGC:8850


917 


60


IMAGE:3914396, mRNA, complete
cds.






69 


gi3228237 


Homo sapiens 


UHS KerB gene. 


810 


72 


69 


gi200962 


Mus musculus 


serine 1 ultra high sulfur protein


755 


69 


69 


gi32472 


Homo sapiens 


H.sapiens mRNA for high-sulphur 
keratin 


749 


71 


70 


AAB92789 


Homo sapiens 


Human protein sequence SEQ ID 
NO:11284.


3518 


100 


70 






cDNA FLJ10407 fis, clone
NT2RM4000520. 


3518 


100 


70 






hypothetical protein FLJ10407, clone
MGC:970 IMAGE:3509727, mRNA,


"^Sl 1 


QO 
yy 


71 


gi13325178




Similar to RIKEN cDNA 22 1 001 16
gene, clone MGC:10999
IMAGE:3638524, mRNA, complete
cds. 


856 


100 


71 


gi7291278 


Drosophila
melanogaster


CG9752 gene product


744 


43 


71 


gi2854153 


Caenorhabditis 
elegans


Hypothetical protein C11D2.4 


729 


45 


72 


gi7020991 


Homo sapiens 


cDNA FLJ20718 fis, clone HEP17872.


3013 


100 


72 


gi15680144


Homo sapiens


hypothetical protein FLJ20718, clone
IMAGE:4577269, mRNA, partial cds.


2906 


99 


72 


gi10801646


Macaca 
fascicularis 


hypothetical protein 


1097 


99 


73 


AAG93290 


Homo sapiens 


Human protein HP10650. 


1215 


100 


73 


gi14587195


Homo sapiens


FAPP1-associated protein 1 (FASP1)
mRNA, complete cds. 


1215 


100 


73 


gi8n8225 


Homo sapiens 


chromosome 21 unknown mRNA. 


1215 


100 


74 


gi10436998


Homo sapiens 


cDNA: FLJ21011 fis, clone 
CAE04289. 


2522 


100 


74 


gi15030282


Homo sapiens 


clone MGC:16827 IMAGE:3855873, 
mRNA, complete cds. 


2522 


100 


74 


gi8570641 


Homo sapiens 


clone 133K02 unknown mRNA. 


2514 


99 


75 


gi6599255 


Homo sapiens 


mRNA; cDNA DKFZp434C0328 
(from clone DKFZp434C0328). 


1612 


100 


75 


gi6330416 


Homo sapiens 


mRNA for KIAA1201 protein, partial 
cds. 


554 


38 


75 


AAB74726 


Homo sapiens 


Human membrane associated protein 
MEMAP-32. 


496 


35 


76 


gi7021059 


Homo sapiens 


cDNA FLJ20758 fis, clone HEP01508.


1450 


100 


76 


AAW88552 


Homo sapiens 


Secreted protein encoded by gene 19 
clone HSAVU34. 


1429 


100 


76 


gi15341707


Homo sapiens 


clone MGC:19979 IMAGE:3939273, 
mRNA, complete cds. 


1429 


100 


77 


AAB95410 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17796. 


774 


100 


77 


gi10435394


Homo sapiens


cDNA FLJ13391 fis, clone
PLACE1001241.


774 


100 


77 


gi10503974


Homo sapiens 


clone SP24 unknown mRNA. 


765 


99 


78 


gi7020587 


Homo sapiens 


cDNA FLJ20467 fis, clone KAT06638.


737 


100


78


AAB42883 


Homo sapiens 


Human ORFX ORF2647 polypeptide 
sequence SEQ ID NO:5294. 


530 


100 


78 


AAB56642 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1220. 


530 


100 


79 


AAW93948 


Homo sapiens 


Human regulatory molecule HRM-4
protein.


441 


91 


79 


gi12852696


Mus musculus 


putative 


386 


47 


79 


gi12751103


Homo sapiens 


PNAS-129 mRNA, complete cds. 


348 


100 


80 


gi7243053 


Homo sapiens 


mRNA for KIAA1336 protein, partial 
cds. 


3851 


99 


80 


gi7292144 


Drosophila 
melanogaster 


CG2069 gene product 


1634 


44 


80 


gi1065457


Caenorhabditis
elegans 


C54G7.4 gene product 


706 


25 


81 


gi10439581


Homo sapiens 


cDNA: FLJ23023 fis, clone 
LNG01678. 


652 


100 


81 


gi7021132 


Homo sapiens 


cDNA FLJ20813 fis, clone
ADSE01247. 


652 


100 


81 


AAG74674 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5438.


556 


92 


82 


gi5262611


Homo sapiens


mRNA; cDNA DKFZp434I114 (from
clone DKFZp434I114); complete cds.


838 


100 


82 


gi11493368


Homo sapiens


Human DNA sequence from clone
RP5-1009E24 on chromosome 20.
Contains the SN gene encoding
sialoadhesin, a novel gene similar to
KIAA0417, the CENPB gene for
centromere protein B, the CDC25B
gene for Cell division cycle protein
25B, three novel genes, the 5' end of
gene KIAA1271, nine CpG islands,
ESTs, STSs and GSSs, complete
sequence. 


838 


100 


82 


gi13543798


Mus musculus 


RIKEN cDNA 4931426K16 gene 


680 


92 


83 


AAB57003 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1581. 


1302 


99 


83 


AAR60558 


Homo sapiens 


Humanbasiginl. 


1302 


99 


83 


gi3492872 


Homo sapiens 


chromosome 19, cosmid F18382 
(LLNLF-140D2) and 3' overlapping
restriction fragment, complete
sequence. 


1302 


99 


84 


gi9187614 


Homo sapiens 


mRNA full length insert cDNA clone 
EUROMAGE 1759349. 


580 


100 


84 


AAB01394 


Homo sapiens 


Neuron-associated protein. 


70 


39 


84 


AAB54358 


Homo sapiens


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:810. 


70 


39 


85 


gi15986445


Homo sapiens 


p90 autoantigen mRNA, complete cds. 


4513 


99 


85 


gi7959315 


Homo sapiens 


mRNA for KIAA1524 protein, partial 
cds. 


4357 


99 


85 


AAB95207 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17311. 


2341


100 


86 


gi7959231 


Homo sapiens 


mRNA for KIAA1485 protein, partial 
cds. 


5813 


99


86


AAB40418 


Homo sapiens 


Human ORFX ORF182 polypeptide 
sequence SEQ ID NO:364. 


708 


99 


86 


gi5901529 


Homo sapiens 


C2H2 type Kruppel-like zinc finger
protein splice variant b (ZNF236) 
mRNA, complete cds. 


520 


24 


87 


gi7243270 


Homo sapiens 


mRNA for KIAA1436 protein, partial
cds. 


4604 


99 


87 


gi5051974 


Mus musculus 


F2 alpha prostoglandin regulatory 
protein 


4195 


89 


87 


gi1054884


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory 
protein precursor 


4191 


88 


88 


gi13241286


Mus musculus 


GABA(A) receptor-associated protein- 
like 2 


607 


100 


88 


gi2104570


Rattus 
norvegicus 


GEF-2 


607 


100 


88 


gi4433387 


Bos taurus


general protein transport factor p16


607 


100 


89 


gi15859535


Homo sapiens 


unnamed protein product 


5935 


99 


89 


gi3043606 


Homo sapiens 


mRNA for KIAA0541 protein, partial 
cds. 


5890 


100 


89 


gi15624075


Homo sapiens 


TGF-beta resistance-associated protein 
TRAG (TRAG) mRNA, partial cds. 


5719 


96 


90 


gi337370 


Homo sapiens 


Human rapamycin- and FK506-binding 
protein, complete cds.


740 


100 


90 


gi13097252


Homo sapiens


Similar to FK506 binding protein 2 (13
kDa), clone MGC:5177
IMAGE:3445148, mRNA, complete
cds. 


740 


100 


90 


AAQ31004 aa 
1 


Homo sapiens 


hRFKBP cDNA.


735 


99 


91 


gi12053147


Homo sapiens


mRNA; cDNA DKFZp434F1726 (from
clone DKFZp434F1726). 


1450 


100 


91 


gi412195 


Homo sapiens 


unknown 


265 


98 


91 


AAR04931 


Homo sapiens 


Interferon-gamma receptor segment 
from clone 39 responsible for binding
the target 


260 


96 


92 


gi10437948


Homo sapiens


cDNA: FLJ21783 fis, clone HEP00284.


3276 


100 


92 


AAB95352 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17643. 


1953 


99 


92 


gi10435077


Homo sapiens


cDNA FLJ13171 fis, clone
NT2RP3003819. 


1953 


99 


93 


gi12803319


Homo sapiens


clone MGC:3090 IMAGE:3347913,
mRNA, complete cds.


4837 


99 


93 


gi14044064


Homo sapiens


hypothetical protein DKFZp762M115,
clone MGC:14418 IMAGE:4302613,
mRNA, complete cds.


4831 


99 


93 


gi10047337


Homo sapiens


mRNA for KIAA1630 protein, partial
cds.


4671 


100 


94 


AAB70535 


Homo sapiens 


Human PRO5 protein sequence SEQ
ID NO:10.


2979 


100 


94 


gi13185719


Homo sapiens 


unnamed protein product 


2979 


100 


94 


AAB94106 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14334.


2334 


100 


95 


gi12837873


Mus musculus


putative 


2370 


75


95


gi13195574


Mus musculus


Praja1 isoform a


2339 


75 


95 


AAB93847 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13691. 


1941 


99 


96 


gi2224543 


Homo sapiens 


Human mRNA for KIAA0301 gene, 
partial cds. 


10626 


100 


96 


gi7529572 


Homo sapiens 


Human DNA sequence from clone 
RP1-122O8 on chromosome 6q14.2-
16.1. Contains the 3' part of a novel
gene partially coded for by KIAA0301,
a novel gene and the 3' part of the gene
KIAA0957. Contains ESTs, STSs, 
GSSs and a putative CpG island, 
complete sequence. 


10626 


100 


96 


gi10727627


Drosophila 
melanogaster 


CG13185 gene product 


1452 


34 


97




Homo sapiens


Human immunoglobulin receptor
IRTA5 protein.


2235 


100 


97


gi15528831


Homo sapiens


Fc receptor-like protein 1 (FCRH1)
mRNA, complete cds. 


2235 


100 


97




Homo sapiens


Human DNA sequence from clone
RP11-367J7 on chromosome 1.
Contains (part of) two or more genes
for novel Immunoglobulin domains
containing proteins, a SON DNA
binding protein (SON) pseudogene, a
voltage-dependent anion channel 1
(VDAC1) (plasmalemmal porin)
pseudogene, ESTs, STSs and GSSs, 
complete sequence. 


1533 


100 


98 


AAB82318 


Homo sapiens


Human immunoglobulin receptor 
IRTA5 protein. 


2177 


98 


98 


gi15528831


Homo sapiens


Fc receptor-like protein 1 (FCRH1)
mRNA, complete cds. 


2177 


98 


98 


gi9930921 


Homo sapiens 


Human DNA sequence from clone 
RP11-367J7 on chromosome 1.
Contains (part of) two or more genes
for novel Immunoglobulin domains
containing proteins, a SON DNA
binding protein (SON) pseudogene, a
voltage-dependent anion channel 1
(VDAC1) (plasmalemmal porin)
pseudogene, ESTs, STSs and GSSs,
complete sequence.


1533 


100 


99 


gi10438861


Homo sapiens


cDNA: FLJ22461 fis, clone
HRC10107. 


4904 


100 


99 


gi15079400


Homo sapiens


clone MGC:16796 IMAGE:3855477,
mRNA, complete cds. 


4899 


99 


99 


AAU03497 


Homo sapiens 


Human sterol sensing domain protein. 


4047 


99 


100 


gi6524024 


Mus musculus 


mammalian inositol hexakisphosphate 
kinase 1 


1031 


50 


100 


gi10280996


Rattus 
norvegicus 


inositol hexakisphosphate kinase 


1027 


49 


100 


gi6683115 


Homo sapiens 


mRNA for KIAA0263 protein, partial 


1021 


49


cds.






101 


gi6524024 


Mus musculus


mammalian inositol hexakisphosphate 
kinase 1 


1037 


51 


101 


gi10280996


Rattus 
norvegicus 


inositol hexakisphosphate kinase 


1033 


50 


101 


gi6683115 


Homo sapiens 


mRNA for KIAA0263 protein, partial 
cds. 


1027 


50 


102 


gi13623311


Homo sapiens 


clone IMAGE:3948563, mRNA, 
partial cds. 


1629 


100 


102 


gi3135968 


Homo sapiens 


Human DNA sequence from clone 
XXbac-3418 on chromosome 6p21.3- 
22.1. Contains the 5' end of the
ZNF184 gene for Kruppel-like zinc
finger protein 184, a heterogeneous
nuclear ribonucleoprotein A1
(HNRPA1) pseudogene, a CD83
antigen pseudogene, ESTs, STSs, GSSs
and three CpG islands, complete
sequence. 


1627 


47 


102 


gi1769491


Homo sapiens 


Human kruppel-related zinc finger 
protein (ZNF184) mRNA, partial cds. 


1625 


47 


103 


gi16198398


Homo sapiens 


clone MGC:27353 IMAGE:4671816, 
mRNA, complete cds. 


2606 


85 


103 


gi829151 


Homo sapiens 


H.sapiens ZNF37A mRNA for zinc 
finger protein. 


1371 


99 


103 


gi9801232 


Homo sapiens 


Human DNA sequence from clone 
RP11-508N22 on chromosome 10.
Contains part of a novel gene 
(HSPC025), part of the ZNF37A (zinc 
finger protein 37a (KOX 21)) gene, 
part of a putative novel gene, ESTs, 
STSs, GSSs and a CpG Island, 
complete sequence. 


1371 


99 


104 


gi12053123


Homo sapiens 


mRNA; cDNA DKFZp434K1421 
(from clone DKFZp434K1421); 
complete cds. 


2624 


100 


104 


gi7292866 


Drosophila
melanogaster 


CG15747 gene product 


362 


31 


104 


gi7549210 


Babesia 
bigemina 


200 kDa antigen p200 


298 


21 


105 


gi12053123


Homo sapiens


mRNA; cDNA DKFZp434K1421
(from clone DKFZp434K1421);
complete cds.


2898 


100 


105 


gi6841130 


Homo sapiens 


HSPC095 mRNA, partial cds. 


419 


100 


105 


gi7292866 


Drosophila 
melanogaster 


CG15747 gene product 


364 


30 


106 


gi10438207


Homo sapiens


cDNA: FLJ21977 fis, clone HEP05976.


1978 


99 


106 


gi15012167


Homo sapiens


hypothetical protein FLJ21977, clone
MGC:14918 IMAGE:3936410, mRNA,
complete cds.


1974 


99 


106 


AAB42499 


Homo sapiens 


Human ORFX ORF2263 polypeptide
sequence SEQ ID NO:4526.


1392 


100 


107 


gi1228035


Homo sapiens


Human mRNA for KIAA0191 gene,


8020 


99


partial cds.






107 


gi12697967


Homo sapiens


mRNA for KIAA1711 protein, partial
cds. 


1593 


58 


107 


AAB94636 


Homo sapiens 


Human protein sequence SEQ ID
NO:15515. 


1004 


52 


108 


AAG81252 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:22. 


2146 


99 


108 


gi14035812


Homo sapiens 


unnamed protein product 


2146 


99 


108 


gi10440123


Homo sapiens 


cDNA: FLJ23436 fis, clone 
HRC12692. 


2054 


100 


109 


gi200009 


Mus musculus 


myosin I 


5386 


96 


109 


gi1666471


Mus musculus 


myosin I heavy chain 


5360 


94 


109 


gi56733 


Rattus 
norvegicus 


myosin I heavy chain 


5268 


91 


110 


gi12053045


Homo sapiens


mRNA; cDNA DKFZp434K1115
(from clone DKFZp434K1115);
complete cds.


4840 


100 


110 


AAB65631 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 158. 


4835 


99 


110 


gi14133215


Homo sapiens 


mRNA for KIAA0781 protein, partial 
cds. 


4678 


100 


111 


gi12642596


Homo sapiens


nuclear receptor co-repressor/HDAC3
complex subunit TBLR1 (TBLR1)
mRNA, complete cds. 


2725 


100 


111 


AAB95225 


Homo sapiens 


Human protein sequence SEQ ID
NO:17352. 


2720 


99 


111 


gi10434648


Homo sapiens


cDNA FLJ12894 fis, clone
NT2RP2004170, moderately similar to 
Homo sapiens mRNA for transducin 
(beta) like 1 protein. 


2720 


99 


112 


gi2224557 


Homo sapiens 


Human mRNA for KIAA0308 gene,
partial cds.


6666 


99 


112 


AAY23330 


Homo sapiens 


Human tumour suppressor (kismet) 
protein. 


5759 


98 


112 


gi7243213 


Homo sapiens 


mRNA for KIAA1416 protein, partial 
cds. 


5264 


59 


113 


gi12856019


Mus musculus 


putative 


1527 


95 


113 


gi3947604 


Caenorhabditis
elegans


cDNA EST yk129f1.3 comes from this
gene-cDNA EST yk129f1.5 comes
from this gene-cDNA EST yk203e4.3
comes from this gene-cDNA EST
yk191a9.3 comes from this
gene-cDNA EST yk262c10.3 comes
from this gene-cDNA EST yk278f9.3
comes from this gene-cDNA EST
yk325c7.3 comes from this
gene-cDNA EST yk337f1.3 comes
from this gene-cDNA EST yk449a2.3
comes from this gene-cDNA EST
yk203e4.5 comes from this
gene-cDNA EST yk191a9.5 comes
from this gene-cDNA EST yk278f9.5
comes from this gene-cDNA EST
yk262c10.5 comes from this


787 


41


gene-cDNA EST yk325c7.5 comes
from this gene-cDNA EST yk337f1.5
comes from this gene-cDNA EST
yk448g10.5 comes from this
gene-cDNA EST yk449a2.5 comes
from this gene-cDNA EST yk636e2.3
comes from this gene-cDNA EST
yk636e2.5 comes from this
gene-cDNA EST yk550e8.3 comes
from this gene-cDNA EST yk557a9.3
comes from this gene-cDNA EST
yk579c12.3 comes from this
gene-cDNA EST yk614e7.3 comes
from this gene-cDNA EST yk653f1.3
comes from this gene-cDNA EST
yk672b2.3 comes from this
gene-cDNA EST yk550e8.5 comes
from this gene-cDNA EST yk556b1.5
comes from this gene-cDNA EST
yk557a9.5 comes from this
gene-cDNA EST yk579c12.5 comes

comes from this gene-cDNA EST
yk614e7.5 comes from this gene






113 


gi3947603 


Caenorhabditis 
elegans 


cDNA EST yk167h7.3 comes from this
gene-cDNA EST yk167h7.5 comes
from this gene~cDNA EST yk289g5.3
comes from this gene-cDNA EST
yk332h9.3 comes from this
gene-cDNA EST yk289g5.5 comes
from this gene-cDNA EST yk332h9.5
comes from this gene-cDNA EST
yk391h4.5 comes from this
gene~cDNA EST yk653f1.5 comes
from this gene


787 


41 


114 


gi9280136


Macaca 
fascicularis 


unnamed protein product 


3431 


95 


114 


gi4262617 


Caenorhabditis 
elegans 


contains similarity to dual specificity 
phosphatase, catalytic domain
(Pfam:PF00782, Score=16.8, E=7.4e-
05, N=1)


470 


35 


114 


gi5706724 


Homo sapiens 


Cdc14B3 phosphatase mRNA,
complete cds. 


166 


30 


115 


AAB95254 


Homo sapiens 


Human protein sequence SEQ ID
NO: 17423. 


3114 


99 


115 


gi14042385


Homo sapiens


cDNA FLJ14693 fis, clone
NT2RP2005360, weakly similar to
Homo sapiens sentrin/SUMO-specific
protease (SENP1) mRNA.


3114 


99 


115 


gi10514023


Homo sapiens 


sentrin-specific protease (SENP2) 
mRNA, complete cds. 


3107 


99 


116 


gi4240227 


Homo sapiens 


mRNA for KIAA0869 protein, partial 
cds. 


4417 


98


116


gi13879506


Mus musculus 


Unknown (protein for 
IMAGE:3963643) 


4063 


89 


116 


AAB93267 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12300. 


1895 


97 


117 


gi13235092


Homo sapiens 


mRNA for testis specific protein A14 
(TSGA14 gene). 


1957 


100 


117 


gi10438839


Homo sapiens


cDNA: FLJ22445 fis, clone
HRC09438. 


1950 


99 


117 


gi13235344


Mus musculus


testis specific protein a14


1704 


87 


118 


gi7959279 


Homo sapiens 


mRNA for KIAA1509 protein, partial
cds. 


6769 


99 


118 


AAB94101 


Homo sapiens


Human protein sequence SEQ ID
NO:14322.






118 


gi10434073 


Homo sapiens 


cDNA FLJ12531 fis, clone 
NT2RM4000199 


1871 


99 


119 


AAM00936 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 412. 


3350 


100 


119 


AAB42828 


Homo sapiens 


Human ORFX ORF2592 polypeptide 
sequence SEQ ID NO:5184. 


2064 


100 


119 


gi9557949 


Homo sapiens 


mRNA for hypothetical protein 
(ORF1), clone 
Telethon(Italy B41) Strait02270 FL142. 


1931 


100 


120 


AAB11082 


Homo sapiens 


Human secreted protein ZALPHA13 
protein. 


2783 


93 


120 


gi11230043 


Homo sapiens 


unnamed protein product 


2783 


93 


120 


AAB37988 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HDPAS92. 


2741 


93 


121 


gi12852526 


Mus musculus 


putative 


1689 


80 


121 


AAB41765 


Homo sapiens 


Human ORFX ORF1529 polypeptide 
sequence SEQ ID NO:3058. 


1576 


100 


121 


gi4406663 


Homo sapiens 


clone 24945 mRNA sequence, partial 
cds. 


1576 


100 


122 


AAR22958 


Homo sapiens 


Human proteasome component HC5. 


1010 


85 


122 


gi220026 




Human mRNA for proteasome subunit 
HC5. 


1010 


OJ 


122 


gi3790135 


Homo sapiens 


Human DNA sequence from clone 
RP1-191N21 on chromosome 6q27. 
Contains a 7 transmembrane receptor 
(rhodopsin family) (olfactory receptor 
like) pseudogene, the PDCD2 gene for 
programmed cell death 2 (RP8 
homolog), the TBP gene for TATA box 
binding protein, the gene for 
proteasome subunit HC5, ESTs, STSs 
and GSSs, complete sequence. 


1010 




123 


AAB21027 


Homo sapiens 


Human nucleic acid-binding protein, 
NuABP-31. 


1456 


100 


123 


AAB45146 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 27 SEQ ID NO:87. 


1456 


100 


123 


gi4884258 


Homo sapiens 


mRNA; cDNA DKFZp564O092 (from 
clone DKFZp564O092); partial cds. 


1430 


100 


124 


gi13325436 


Homo sapiens 


Similar to RIKEN cDNA 
C330013D18 gene, clone MGC:11226 
IMAGE:3937599, mRNA, complete 
cds. 


1394 


100 






124 


gi13559363 


Homo sapiens 


MRPL9 mRNA for mitochondrial 
ribosomal protein L9 (L9mt), complete 
cds. 


1388 


99 


124 


AAG93251 


Homo sapiens 


Human protein HP02612. 


1153 


86 


125 


AAB85507 


Homo sapiens 


Human protein kinase SGK164. 


2949 


100 


125 


gi13543922 


Homo sapiens 


Similar to RIKEN cDNA 5430416A05 
gene, clone MGC:12903 
IMAGE:3537086, mRNA, complete 
cds. 


2913 


100 


125 


gi12856491 


Mus musculus 


putative 


2135 


79 


126 


gi12653817 


Homo sapiens 


Similar to Male-specific RNA 84Dd, 
clone MGC:3092 IMAGE:3349383, 
mRNA, complete cds. 


3399 


100 


126 


AAB94115 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14356. 


3392 


99 


126 


gi10434102 


Homo sapiens 


cDNA FLJ12549 fis, clone 
NT2RM4000689 


3392 


99 


127 


gi7243187 


Homo sapiens 


mRNA for KIAA1403 protein, partial 
cds. 


6448 


98 


127 


gi12652971 


Homo sapiens 


clone MGC:858 IMAGE:3357380, 
mRNA, complete cds. 


3992 


100 


127 


AAB92872 


Homo sapiens 


Human protein sequence SEQ ID 
NO:11460. 


3987 


99 


128 


AAB94324 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14807. 


1779 


99 


128 


gi10434528 


Homo sapiens 


cDNA FLJ12816 fis, clone 
NT2RP2002609, weakly similar to 2- 
HYDROXYMUCONIC 
SEMIALDEHYDE HYDROLASE (EC 
3.1.1.-). 


1779 


99 


128 


AAB42143 


Homo sapiens 


Human ORFX ORF1907 polypeptide 
sequence SEQ ID NO:3814. 


1521 


100 


129 


gi6329945 


Homo sapiens 


mRNA for KIAA1140 protein, partial 
cds. 


1857 


52 


129 


gi12805043 


Homo sapiens 


clone IMAGE:3461487, mRNA, 
partial cds. 


1279 


54 


129 


gi7302173 


Drosophila 
melanogaster 


BcDNA:LD21719 gene product 


1261 


35 


130 


AAB28199 


Homo sapiens 


Human HMG-17 non-histone 
chromosomal protein. 


322 


75 


130 


gi306864 


Homo sapiens 


Human non-histone chromosomal 
protein HMG-17 mRNA, complete cds. 


322 


75 


130 


gi32329 


Homo sapiens 


Human HMG-17 gene for non-histone 
chromosomal protein HMG-17. 


322 


75 


131 


gi16041794 


Homo sapiens 


clone MGC:23591 IMAGE:4856946, 
mRNA, complete cds. 


2714 


99 


131 


gi15559462 


Homo sapiens 


Similar to old astrocyte specifically 
induced substance, clone MGC:20215 
IMAGE:4546950, mRNA, complete 
cds. 


2709 


99 
131 


gi4519621 


Mus musculus 


OASIS protein 


2406 


91 


132 


gi7573591 


Homo sapiens 


Human DNA sequence from clone 
RP1-309K20 on chromosome 20. 
Contains the gene for a novel protein 
similar to dysferlin, the SPAG4 gene 
for sperm associated antigen 4, the 
CPNE1 gene for Copine I (similar to 
KIAA0636), the gene KIAA0765 
(HRIHFB2091) for an RNA 
recognition motif (RNP, RRM or RBD 
domain) containing protein and the 3' 
end of the NIFS gene for cysteine 
desulfurase. Contains ESTs, STSs, 
GSSs and four putative CpG islands, 
complete sequence. 


4972 


100 


132 


gi15559252 


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:19528 IMAGE:3845090, mRNA, 
complete cds. 


4972 


100 


132 


gi15215375 


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:16487 IMAGE:3956772, mRNA, 
complete cds. 


4972 


100 


133 


gi12697774 


Mus musculus 


acetyl-CoA synthetase 2 


3181 


87 


133 


gi12697772 


Bos taurus 


acetyl-CoA synthetase 2 


3056 


83 


133 


AAB34712 


Homo sapiens 


Human secreted protein encoded by 
DNA clone vo9_1. 


2721 


100 


134 


gi7020783 


Homo sapiens 


cDNA FLJ20580 fis, clone REC00516. 


848 


100 


134 


gi15012026 


Homo sapiens 


Similar to hypothetical protein 
FLJ20580, clone MGC:13430 
IMAGE:4093763, mRNA, complete 
cds. 


848 


100 


134 


gi12833008 


Mus musculus 


putative 


814 


85 


135 


AAB94473 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15139. 


1970 


100 


135 


AAG74880 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5644. 


1970 


100 


135 


AAB43720 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:1165. 


1970 


100 


136 


gi10047285 


Homo sapiens 


mRNA for KIAA1605 protein, partial 
cds. 


3610 


99 


136 


gi16215453 


Homo sapiens 


mRNA for bile acid beta-glucosidase. 


3610 


99 


136 


gi15030210 


Homo sapiens 


KIAA1605 protein, clone MGC:16895 
IMAGE:4339156, mRNA, complete 
cds. 


3610 


99 


137 


gi4914601 


Homo sapiens 


mRNA; cDNA DKFZp564A026 (from 
clone DKFZp564A026). 


4171 


94 


137 


AAB94357 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14881. 


2195 


99 


137 


AAY45161 


Homo sapiens 


Human secreted protein clone 
CO139_3 protein sequence. 


2112 


100 


138 


gi313131 


Torpedo 
marmorata 


alpha-tubulin 


1192 


97 


138 


gi14198110 


Mus musculus 


tubulin alpha 1 


1192 


97 


138 


gi13435777 


Mus musculus 


tubulin alpha 6 


1192 


97 
139 


AAB94856 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16044. 


2138 


100 


139 


AAB94628 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15490. 


2138 


100 


139 


gi10436294 


Homo sapiens 


cDNA FLJ13970 fis, clone 
Y79AA1001533, moderately similar to 
Mouse mRNA for RNA polymerase I 
associated factor (PAF53). 


2138 


100 


140 


gi2642187 


Rattus 
norvegicus 


endo-alpha-D-mannosidase 


1415 


67 


140 


AAB95204 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17303. 


1094 


66 


140 


gi10434559 


Homo sapiens 


cDNA FLJ12838 fis, clone 
NT2RP2003230, moderately similar to 
Rattus norvegicus endo-alpha-D- 
mannosidase (Enman) mRNA. 


1094 


66 


141 


gi3449308 


Homo sapiens 


mRNA for MEGF8, partial cds. 


9785 


100 


141 


gi6681364 


Rattus 
norvegicus 


MEGF8 


4772 


95 


141 


gi10728654 


Drosophila 
melanogaster 


CG7466 gene product 


2902 


34 


142 


AAY29517 


Homo sapiens 


Human lung tumour protein SAL-82 
predicted amino acid sequence. 


3048 


100 


142 


gi13958036 


Homo sapiens 


FYVE-finger protein EIP1 mRNA, 
complete cds. 


3048 


100 


142 


AAY29861 


Homo sapiens 


Human secreted protein clone cb98_4. 


3041 


99 


143 


gi14718539 


Homo sapiens 


HIC-3 mRNA, complete cds. 


3178 


99 


143 


gi5689371 


Homo sapiens 


mRNA for KIAA1020 protein, partial 
cds. 


2970 


99 


143 


gi7328028 


Homo sapiens 


mRNA; cDNA DKFZp434F0616 (from 
clone DKFZp434F0616); partial cds. 


1738 


100 


144 


gi12620400 


Homo sapiens 


mitochondrial carrier protein CGI-69 
long form mRNA, complete cds. 


1856 


99 


144 


AAB42783 


Homo sapiens 


Human ORFX ORF2547 polypeptide 
sequence SEQ ID NO:5094. 


1804 


96 


144 


gi10438783 


Homo sapiens 


cDNA: FLJ22407 fis, clone 
HRC08407. 


1798 


97 


145 


gi2792366 


Homo sapiens 


unknown protein IT12 mRNA, partial 
cds. 


4390 


99 


145 


gi1843399 


Homo sapiens 


mRNA, partial cds, clone:RES4-25. 


3676 


99 


145 


gi14602505 


Homo sapiens 


clone IMAGE:3936655, mRNA, 
partial cds. 


2366 


99 


146 


gi13359167 


Homo sapiens 


mRNA for KIAA1646 protein, partial 
cds. 


2581 


99 


146 


AAY96059 


Homo sapiens 


Human sphingosine kinase C. 


2456 


99 


146 


gi6572330 


Homo sapiens 


Human DNA sequence from clone 
59H18 on chromosome 22. Contains 
the 3' part of the gene for KIAA0767, a 
novel gene, ESTs, STSs, GSSs and a 
putative CpG island, complete 
sequence. 


1627 


96 


147 


gi14043303 


Homo sapiens 


exonuclease NEF-sp, clone 
MGC:15944 IMAGE:3537866, mRNA, 
complete cds. 


4043 


100 






147 


gi13272524 


Homo sapiens 


exonuclease NEF-sp mRNA, complete 
cds. 


4039 


99 


147 


gi12053043 


Homo sapiens 


mRNA; cDNA DKFZp434J0315 (from 
clone DKFZp434J0315); complete cds. 


3843 


95 


148 


gi7243037 


Homo sapiens 


mRNA for KIAA1328 protein, partial 
cds. 


2894 


100 


148 


gi13874541 


Macaca 
fascicularis 


hypothetical protein 


2492 


93 


148 


gi1335313 


Homo sapiens 


Human muscle mRNA for embryonic 
myosin heavy chain (SMHCE). 


129 


24 


149 


AAB42399 


Homo sapiens 


Human ORFX ORF2163 polypeptide 
sequence SEQ ID NO:4326. 


1362 


91 


149 


AAB42366 


Homo sapiens 


Human ORFX ORF2130 polypeptide 
sequence SEQ ID NO:4260. 


626 


100 


149 


gi7298594 


Drosophila 
melanogaster 


CG10189 gene product 


223 


35 


150 


AAB95372 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17692. 


1538 


99 


150 


gi10435150 


Homo sapiens 


cDNA FLJ13220 fis, clone 
NT2RP4002047, moderately similar to 
GTP-BINDING PROTEIN LEPA. 


1538 


99 


150 


gi10437720 


Homo sapiens 


cDNA: FLJ21595 fis, clone 
COL07069. 


1438 


100 


151 


gi3327080 


Homo sapiens 


mRNA for KIAA0633 protein, partial 
cds. 


6823 


99 


151 


gi857571 


Mus musculus 


cordon-bleu gene product 


1345 


81 


151 


gi6094680 


Homo sapiens 


PAC clone RP5-1168M19 from 7p12- 
q11.21, complete sequence. 


1342 


100 


152 


gi15451265 


Macaca 
fascicularis 


hypothetical protein 


2728 


98 


152 


AAB41597 


Homo sapiens 


Human ORFX ORF1361 polypeptide 
sequence SEQ ID NO:2722. 


2650 


100 


152 


gi5689443 


Homo sapiens 


mRNA for KIAA1053 protein, partial 
cds. 


2650 


100 


153 


gi14036062 


Homo sapiens 


unnamed protein product 


1930 


100 


153 


AAG81377 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:272. 


1925 


99 


153 


gi12833112 


Mus musculus 


putative 


1727 


88 


154 


gi12832455 


Mus musculus 


putative 


1220 


89 


154 


gi15080314 


Homo sapiens 


Similar to RIKEN cDNA 0610010020 
gene, clone MGC:20590 
IMAGE:4310241, mRNA, complete 
cds. 


514 


100 


154 


gi6002488 


Penicillium 
chrysogenum 


hypothetical protein 


338 


31 


155 


gi14017889 


Homo sapiens 


mRNA for KIAA1836 protein, partial 
cds. 


2511 


100 


155 


AAB94592 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15402. 


972 


50 


155 


gi10435321 


Homo sapiens 


cDNA FLJ13337 fis, clone 
OVARC1001880. 


972 


50 


156 


gi14550510 


Homo sapiens 


pseudouridylate synthase 1, clone 
MGC:2736 IMAGE:2822709, mRNA, 
complete cds. 


2123 


100 






156 


gi12804097 


Homo sapiens 


Similar to pseudouridine synthase 1, 
clone MGC:11268 IMAGE:3943243, 
mRNA, complete cds. 


2123 


100 


156 


gi4455035 


Homo sapiens 


pseudouridine synthase 1 (PUS1) 
mRNA, partial cds. 


1927 


99 


157 


AAY58052 


Homo sapiens 


Human protein kinase H2LAU20 
protein sequence. 


3198 


98 


157 


gi9652080 


Homo sapiens 


protein kinase DYRK4 (DYRK4) 
mRNA, partial cds. 


2844 


100 


157 


AAW71685 


Homo sapiens 


Amino acid sequence of human 
serine/threonine protein kinase. 


1909 


97 


158 


gi7300952 


Drosophila 
melanogaster 


BcDNA:LD21504 gene product 


971 


62 


158 


gi4972728 


Drosophila 
melanogaster 


unknown 


971 


62 


158 


AAB97646 


Homo sapiens 


Ribosomal S3 protein 17. 


831 


99 


159 


AAU02201 


Homo sapiens 


Phosphatase 1 protein-like protein, 
MEM6. 


1514 


100 


159 


gi15551577 


Homo sapiens 


unnamed protein product 


1514 


100 


159 


AAB95633 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18363. 


1510 


99 


160 


gi12804573 


Homo sapiens 


Similar to CG11334 gene product, 
clone MGC:3207 IMAGE:3501899, 
mRNA, complete cds. 


1859 


100 


160 


gi12851419 


Mus musculus 


putative 


1590 


86 


160 


gi7302053 


Drosophila 
melanogaster 


CG11334 gene product 


1046 


59 


161 


gi1580781 


Homo sapiens 


Human beige-like protein (BGL) 
mRNA, partial cds. 


9734 


99 


161 


gi10180266 


Mus musculus 


LBA 


9333 


86 


161 


gi10257401 


Mus musculus 


LBA isoform beta 


8920 


86 


162 


gi15082589 


Homo sapiens 


clone MGC:4408 IMAGE:2906200, 
mRNA, complete cds. 


2065 


99 


162 


gi15638615 


Arabidopsis 
thaliana 


HEN1 


350 


37 


162 


gi13241746 


Arabidopsis 
thaliana 


CORYMBOSA2 


350 


37 


163 


gi15291227 


Drosophila 
melanogaster 


GH13040p 


701 


40 


163 


gi7303780 


Drosophila 
melanogaster 


CG12214 gene product 


701 


40 


163 


AAB95882 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18991. 


501 


100 


164 


gi3327170 


Homo sapiens 


mRNA for KIAA0678 protein, partial 
cds. 


5255 


100 


164 


AAB95304 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17542. 


4431 


99 


164 


gi14134120 


Caenorhabditis 
elegans 


endocytosis protein RME-8 


2127 


42 


165 


AAB53427 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:967. 


813 


96 
165 


gi13905098 


Mus musculus 


B-cell translocation gene 1, anti- 
proliferative 


813 


96 


165 


gi293306 


Mus musculus 


B-cell translocation gene-1 protein 


813 


96 


166 


gi13365897 


Macaca 
fascicularis 


hypothetical protein 


2501 


97 


166 


AAY02168 


Homo sapiens 


A facilitative glucose transporter 
protein GLUTS. 


870 


99 


166 


gi13445575 


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


835 


39 


167 


gi13365897 


Macaca 
fascicularis 


hypothetical protein 


2173 


97 


167 


AAY02168 


Homo sapiens 


A facilitative glucose transporter 
protein GLUTS. 


870 


99 


167 


gi13445575 


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


678 


37 


168 


gi10047251 


Homo sapiens 


mRNA for KIAA1588 protein, partial 
cds. 


3292 


100 


168 


gi14424704 


Homo sapiens 


clone MGC:15071 IMAGE:4110510, 
mRNA, complete cds. 


2315 


100 


168 


gi4567179 


Homo sapiens 


chromosome 19, BAC 37295 (CIT-B- 
21A4), complete sequence. 


1269 


43 


169 


gi15558943 


Homo sapiens 


guanylate binding protein 4 mRNA, 
complete cds. 


3134 


99 


169 


gi1174187 


Mus musculus 


purine nucleotide binding protein 


2260 


70 


169 


gi193444 


Mus musculus 


guanylate binding protein 


1986 


66 


170 


gi14585859 


Homo sapiens 


hypothetical protein SB138 


1121 


100 


170 


gi6665778 


Mus musculus 


cyclin ania-6b 


1052 


92 


170 


gi12841169 


Mus musculus 


putative 


1052 


92 


171 


AAB64407 


Homo sapiens 


Amino acid sequence of human 
intracellular signalling molecule 
INTRA39. 


3394 


100 


171 


AAB71963 


Homo sapiens 


Human TGF-beta receptor encoded by 
cDNA clone HFIHY04. 


3394 


100 


171 


gi10438113 


Homo sapiens 


cDNA: FLJ21908 fis, clone HEP03830. 


3385 


99 


172 


gi12652533 


Homo sapiens 


clone MGC:2637 IMAGE:3505128, 
mRNA, complete cds. 


676 


89 


172 


AAB67453 


Homo sapiens 


Amino acid sequence of a human 
chaperone polypeptide. 


668 


88 


172 


gi9758421 


Arabidopsis 
thaliana 


gene_id:MHF15.7~similar to unknown 
protein 


199 


28 


173 


AAB97025 


Homo sapiens 


Human colon carcinoma suppressor 
gene-related protein. 


1773 


61 


173 


gi9857318 


Homo sapiens 


Asef mRNA for APC-stimulated 
guanine nucleotide exchange factor, 
complete cds. 


1773 


61 


173 


gi8809845 


Homo sapiens 


chromosome 2q22 RhoGEF mRNA, 
complete cds. 


1700 


61 


174 


gi12052828 


Homo sapiens 


mRNA; cDNA DKFZp564N1062 
(from clone DKFZp564N1062); 
complete cds. 


1601 


99 


174 


gi12850603 


Mus musculus 


putative 


1062 


92 
174 


AAB94655 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15568. 


671 


100 


175 


gi15080282 


Homo sapiens 


Similar to putative sialoglycoprotease 
type 2, clone MGC:20293 
IMAGE:4121450, mRNA, complete 
cds. 


1747 


99 


175 


gi11071727 


Homo sapiens 


mRNA for putative sialoglycoprotease 
type 2. 


1707 


92 


175 


gi12847276 


Mus musculus 


putative 


1541 


84 


176 


AAB36628 


Homo sapiens 


Human FLEXHT-50 protein sequence 
SEQ ID NO:50. 


527 


100 


176 


AAB94208 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14557. 


527 


100 


176 


AAG01512 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5593. 


527 


100 


177 


gi15929052 


Homo sapiens 


Similar to RIKEN cDNA 2810442016 
gene, clone MGC:23197 
IMAGE:4861869, mRNA, complete 
cds. 


2084 


100 


177 


gi11493155 


Homo sapiens 


Human DNA sequence from clone 
RP5-852M4 on chromosome 20. 
Contains the gene encoding the HBV 
associated factor, a novel gene similar 
to Drosophila CG17883, a putative 
novel gene, two CpG islands, ESTs, 
GSSs, and STSs, complete sequence. 


1952 


100 


177 


gi12840168 


Mus musculus 


putative 


1938 


93 


178 


AAB87034 


Homo sapiens 


Human secreted protein TANGO 339, 
SEQ ID NO:3. 


1449 


100 


178 


AAY76266 


Homo sapiens 


Human secreted protein encoded by 
gene 10 fragment 


1449 


100 


178 


AAB87135 


Homo sapiens 


Human secreted protein TANGO 339 
F20Y variant, SEQ ID NO:139. 


1446 


99 


179 


gi434763 


Homo sapiens 


Human mRNA for KIAA0120 gene, 
complete cds. 


1048 


100 


179 


gi14424677 


Homo sapiens 


transgelin 2, clone MGC:15279 
IMAGE:4301018, mRNA, complete 
cds. 


1048 


100 


179 


gi9956026 


Homo sapiens 


clone CDABP0035 mRNA sequence. 


1048 


100 


180 


AAB31677 


Homo sapiens 


Amino acid sequence of a human 
protein having a hydrophobic domain. 


2803 


100 


180 


AAE03346 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO:120. 


2803 


100 


180 


AAE03310 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO:84. 


2803 


100 


181 


AAB41910 


Homo sapiens 


Human ORFX ORF1674 polypeptide 
sequence SEQ ID NO:3348. 


1530 


99 


181 


gi5262467 


Homo sapiens 


mRNA; cDNA DKFZp564I122 (from 
clone DKFZp564I122). 


1530 


99 


181 


gi12849716 


Mus musculus 


putative 


1259 


82 


182 


gi2072972 


Homo sapiens 


Human L1 element L1.25 p40 and 
putative p150 genes, complete cds. 


497 


53 


182 


AAB64943 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 7 SEQ ID NO:121. 


494 


54 






182 


gi5070622 


Homo sapiens 


retrotransposon L1 insertion in X- 
linked retinitis pigmentosa locus, 
complete sequence. 


494 


53 


183 


AAB59191 


Homo sapiens 


Human NADE. 


217 


47 


183 


gi8452894 


Homo sapiens 


p75NTR-associated cell death executor 
(NADE) mRNA, complete cds. 


217 


47 


183 


gi189379 


Homo sapiens 


Human unknown protein from clone 
pHGR74 mRNA, complete cds. 


217 


47 


184 


AAB88468 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0263. 


4931 


97 


184 


gi14272788 


Homo sapiens 


unnamed protein product 


4931 


97 


184 


gi577301 


Homo sapiens 


Human mRNA for KIAA0090 gene, 
partial cds. 


4650 


99 


185 


AAG64953 


Homo sapiens 


Human ATP-dependent helicase 
protein 68, 


3169 


100 


185 


gi12052748 


Homo sapiens 


mRNA; cDNA DKFZp564B1023 
(from clone DKFZp564B1023); 
complete cds. 


2716 


100 


185 


gi12836314 


Mus musculus 


putative 


2655 


83 


186 


gi14017781 


Homo sapiens 


mRNA for KIAA1782 protein, partial 
cds. 


2834 


99 


186 


gi4062983 


Mus musculus 


Eos protein 


2747 


95 


186 


gi11612390 


Homo sapiens 


zinc finger transcription factor Eos 
mRNA, complete cds. 


2603 


98 


187 


AAB95721 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18592. 


2419 


100 


187 


gi10436538 


Homo sapiens 


cDNA FLJ14153 fis, clone 
NT2RM1000092, weakly similar to 
MULTIDRUG RESISTANCE 
PROTEIN 2. 


2419 


100 


187 


gi12248763 


Homo sapiens 


mRNA for SMAP-4, complete cds. 


2323 


96 


188 


gi13278906 


Homo sapiens 


clone MGC:4440 IMAGE:2959536, 
mRNA, complete cds. 


1040 


100 


188 


gi13278819 


Homo sapiens 


clone MGC:2776 IMAGE:2959536, 
mRNA, complete cds. 


1040 


100 


188 


AAB95829 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18847. 


618 


79 


189 


gi14602977 


Homo sapiens 


Similar to KIAA0789 gene product, 
clone MGC:16602 IMAGE:4110708, 
mRNA, complete cds. 


3100 


99 


189 


gi3043570 


Homo sapiens 


mRNA for KIAA0523 protein, partial 
cds. 


2564 


100 


189 


gi14133217 


Homo sapiens 


mRNA for KIAA0789 protein, partial 
cds. 


1463 


49 


190 


gi9717245 


Mus musculus 


cytoplasmic dynein heavy chain 


5569 


98 


190 


gi294543 


Rattus 
norvegicus 


dynein heavy chain 


5557 


98 


190 


gi402528 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


5535 


98 


191 


gi13537204 


Homo sapiens 


mRNA for MAST205, complete cds. 


6834 


98 


191 


gi406058 


Mus musculus 


protein kinase 


6343 


86 


191 


gi3882335 


Homo sapiens 


mRNA for KIAA0807 protein, partial 
cds. 


6300 


98 






192 


gi12847109 


Mus musculus 


putative 


1356 


79 


192 


gi13623271 


Homo sapiens 


Similar to RIKEN cDNA 2600005P05 
gene, clone MGC:11321 
IMAGE:39518p4, mRNA, complete 
cds. 


1332 


100 


192 


gi12847837 


Mus musculus 


putative 


1170 


76 


193 


gi38149 


Pongo 
pygmaeus 


epsilon-globin 


397 


100 


193 


gi903731 


Gorilla gorilla 


epsilon-globin 


397 


100 


193 


gi903707 


Pan 
troglodytes 


epsilon-globin 


397 


100 


194 


AAB74695 


Homo sapiens 


Human membrane associated protein 
MEMAP-1. 


1799 


100 


194 


AAE01340 


Homo sapiens 


Human gene 22 encoded secreted 
protein fragment, SEQ ID NO:205. 


1799 


100 


194 


gi15929183 


Homo sapiens 


modulator of apoptosis 1, clone 
MGC:9487 IMAGE:3922055, mRNA, 
complete cds. 


1799 


100 


195 


AAG93260 


Homo sapiens 


Human protein HP10106. 


1769 


100 


195 


gi15029765 


Mus musculus 


RIKEN cDNA 2810039M17 gene 


1650 


91 


195 


gi12849932 


Mus musculus 


putative 


1650 


91 


196 


gi14017843 


Homo sapiens 


mRNA for KIAA1813 protein, partial 
cds. 


3434 


100 


196 


gi15193290 


Homo sapiens 


LAPSER1 (LAPSER1) mRNA, 
complete cds. 


3309 


100 


196 


gi8217421 


Homo sapiens 


Human DNA sequence from clone 
RP11-108L7 on chromosome 10. 
Contains part of the gene for a novel 
Insulin-like growth factor binding type 
protein with Kazal-type serine protease 
inhibitor domain, the gene for a novel 
protein similar to rat tricarboxylate 
carrier, the gene for a novel 
(DHR, GLGF) domain protein, the 
gene for a novel protein similar to 
KIAA0552, KIAA0341 and Fugu 
hypothetical protein 2, the gene for a 
novel protein similar to Plasmodium 
POM1 and C. elegans F46G11.1, a 
putative novel gene, the SEMA4G gene 
for semaphorin 4G and a novel gene. 
Contains ESTs, STSs, GSSs and seven 
putative CpG islands, complete 
sequence. 


3264 


100 


197 


gi1458241 


Caenorhabditis 
elegans 


Hypothetical protein B0507.2 


782 


39 


197 


gi12832510 


Mus musculus 


putative 


490 


89 


197 


AAB54014 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:466. 


242 


100 


198 


gi500747 


Mus musculus 


capping protein beta-subunit, isoform 1 


1440 


98 


198 


gi212902 


Gallus gallus 


actin-capping protein Z beta subunit 


1432 


98 


198 


gi12805189 


Mus musculus 


capping protein (actin filament) muscle 
Z-line, beta 


1318 


92 






199 


gi14017787 


Homo sapiens 


mRNA for KIAA1785 protein, partial 
cds. 


3195 


100 


199 


gi13436428 


Homo sapiens 


Similar to feminization 1 a homolog 
(C. elegans), clone MGC:4216 
IMAGE:2957950, mRNA, complete 
cds. 


2197 


64 


199 


gi12836689 


Mus musculus 


putative 


2164 


65 


200 


gi7959811 


Homo sapiens 


PRO1167 


389 


100 


200 


gi2736345 


Caenorhabditis 
elegans 


contains similarity to G-coupled protein 
receptors 


69 


33 


200 


gi7504953 


Caenorhabditis 
elegans 


hypothetical protein H22D07.1 - 
Caenorhabditis elegans 


69 


33 


201 


gi12697975 


Homo sapiens 


mRNA for KIAA1715 protein, partial 
cds. 


2230 


100 


201 


AAB42461 


Homo sapiens 


Human ORFX ORF2225 polypeptide 


1015 


100 


201 


gi12844031 


Mus musculus 


putative 


567 


92 


202 




Drosophila 
melanogaster 






77 


202 


&i104^gQnO 




HRC10983. 


Krt 


07 


202 


gi5824430 


Caenorhabditis 
elegans 


cDNA EST vlr^Olh? S comes from this 
gene~cDNA EST yk523d4.5 comes 
from this gene~cDNA EST vk553ffi 5 
comes from this gene~cDNA EST 
yk595g12.5 comes from this 
gene~cDNA EST yk606g10.5 comes 
from this gene~cDNA EST yk652G.5 
comes from this gene 




21 


203 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1725 


100 


203 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


1484 


62 


203 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


1484 


62 


204 


AAM00844 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 207. 


1051 


98 


204 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


779 


69 


204 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


779 


69 


205 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1576 


92 


205 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


1349 


57 


205 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


1349 


57 


206 


gi7242969 


Homo sapiens 


mRNA for KIAA1307 protein, partial 
cds. 


8582 


99 


206 


AAM00860 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 223. 


4841 


98 


206 


gi4426611 


Drosophila 
melanogaster 


pushover 


2137 


46 








207 


AAB62210 


Homo sapiens 


Human ABCA2 transporter protein. 


9835 


99 


207 


gi13173186 


Homo sapiens 


ABC transporter ABCA2 (ABCA2) 
mRNA, complete cds. 


9835 


99 


207 


gi9957467 


Homo sapiens 


ATP-binding cassette sub-family A 
member 2 (ABCA2) mRNA, complete 
cds. 


9835 


99 


208 


AAB94358 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14883. 


2268 


99 


208 


gi10434632 


Homo sapiens 


cDNA FLJ12886 fis, clone 
NT2RP2004041, weakly similar to 
SYNAPSINS IA AND IB. 


2268 


99 


208 


gi12052738 


Homo sapiens 


mRNA; cDNA DKFZp564H1322 
(from clone DKFZp564H1322); 
complete cds. 


2268 


99 




01*14^97199 

gl 1 / LJ,^ 




RP4-583P15 on chromosome 20. 
Contains ESTs, STSs, GSSs and ten 
CpG islands. Contains the TNFRSF6B 
gene for tumor necrosis factor receptor 
6b (decoy), the 3' part of the 
KIAA1088 gene, the ARFRP1 gene for 
ADP-ribosylation factor related protein 
1, two genes for novel proteins, the 
gene for a GLUT4 enhancer factor and 
the gene for a novel zinc finger protein 
similar to rat RIN ZF and the gene for a 
novel BTB/POZ domain containing 
zinc finger protein, complete sequence. 


2074 


99 


209 


gi13162677 


Homo sapiens 


GLUT4 enhancer factor mRNA, 
complete cds. 


2055 


98 


209 


gi12655101 


Homo sapiens 


clone IMAGE:3140406, mRNA, 
partial cds. 


1766 


100 


210 


gi14279329 


Homo sapiens 


ubiquitin specific protease (USP28) 
mRNA, complete cds. 


4131 


92 


210 


gi7959297 


Homo sapiens 


mRNA for KIAA1515 protein, partial 
cds. 


3872 


100 


210 


AAB31552 


Homo sapiens 


A human ubiquitin specific protease 25 
(USP25). 


2058 


48 


211 


AAB36579 


Homo sapiens 


Human FLEXHT-1 protein sequence 
SEQ ID NO:1. 


1829 


100 


211 


AAB94048 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14211. 


1825 


99 


211 


gi10433984 


Homo sapiens 


cDNA FLJ12475 fis, clone 
NT2RM1000962. 


1825 


99 


212 


gi15824499 


Homo sapiens 


GalNAc-4-O-sulfotransferase 1 
mRNA, complete cds. 


2238 


100 


212 


gi11990885 


Homo sapiens 


GalNAc4ST mRNA for GalNAc 4- 
sulfotransferase, complete cds. 


2238 


100 


212 


gi15559803 


Homo sapiens 


carbohydrate (N-acetylgalactosamine 
4-O) sulfotransferase 8, clone 
MGC:20987 IMAGE:4635405, mRNA, 
complete cds. 


2238 


100 
213 


AAB43387 


Homo sapiens 


Human ORFX ORF3151 polypeptide 
sequence SEQ ID NO:6302. 


1056 


100 


213 


gi15292317 


Drosophila 
melanogaster 


LD46863p 


549 


50 


213 


gi7302029 


Drosophila 
melanogaster 


CG12054 gene product 


549 


50 


214 


gi12843216 


Mus musculus 


putative 


913 


84 


214 


gi14585867 


Homo sapiens 


hypothetical protein SB145 


297 


44 


214 


gi14388386 


Macaca 
fascicularis 


hypothetical protein 


295 


44 








mRNA for KIAA0833 protein, partial 
cds. 


7195 


99 








Human DNA sequence from clone 
RP3-467L1 on chromosome 1p36.21- 
36.33. Contains the 3' part of gene 
KIAA0833, the VAMP3 gene for 
vesicle-associated membrane protein 3 
(cellubrevin), the PER3 gene for period 
(Drosophila) homolog 3 and the gene 
for urotensin II. Contains two putative 
CpG islands, ESTs, STSs and GSSs, 
complete sequence. 


3642 


99 


215 


AAB42729 


Homo sapiens 


Human ORFX ORF2493 polypeptide 
sequence SEQ ID NO:4986. 


997 


54 


216 


gi7293088 


Drosophila 

melanogaster 


CG9213 gene product 


811 


30 


216 


gil5810333 


Arabidopsis 
thaliana 


unknown protein 


713 


28 


216 


gil3324888 


Caenorhabditis 
elegans 


Hypothetical protein B0361.2


710 


34 


217 


gi2443331 


Xenopus 
laevis 


Nfrl 


2421 


75 


217 


AAB34944 


Homo sapiens


Human secreted protein sequence 
encoded by gene 20 SEQ ID NO:148. 


1129 


91 


217 


gil5292543 


Drosophila 
melanogaster 


SD06560p 


911 


36 


218 


gi7243111 


Homo sapiens 


mRNA for KIAA1365 protein, partial

cds. 


3855 


100 


218 


gil657758 


Rattus 
norvegicus 


densin-180 


3640 


93 


218 


gi8570180 


Rattus 
norvegicus 


densin-180 variant D


1250 


83 


219 


gil4017839 


Homo sapiens 


mRNA for KIAA1811 protein, partial
cds. 


1726 


80 


219 


gi3217028 


Homo sapiens 


mRNA for putative serine/threonine 
protein kinase, partial. 


1450 


84 


219 


gi7294217 


Drosophila 
melanogaster 


CG6114 gene product


1055 


70 


220 


gi7297674 


Drosophila 
melanogaster


CG13139 gene product 


942 


75 


220 


Eil2857050 


Mus musculus 


putative 


767 


62 


220 


gil5636900 


Gallus gallus 


avEna neural variant 


139 


52 


221 


fiil5489242 


Homo sapiens 


clone IMAGE:3859726, mRNA,
partial cds.


1001


88






221 


gil3543991 


Homo sapiens 


clone IMAGE:3627860, mRNA, 
partial cds. 


1001 


88 


221 


gil2847182 


Mus musculus 


putative 


328 


39 


222 


gil4133209 


Homo sapiens 


mRNA for KIAA0654 protein, partial 
cds. 


6089 


99 


222 


gi930343 


Homo sapiens 


Human LAR-interacting protein 1b
mRNA, complete cds. 


3559 


60 


222




Homo sapiens


nuiimD. Lrf/\j\.~micraciujg pFoceizi is 




Ov 


223 


gil2620207 


Homo sapiens 


C1orf25 mRNA, complete cds.


3807 


98 






Homo sapiens


Human DNA sequence from clone
GSM20K12 on chromosome 1q25.3-

H 1 2 ^rtntsaiTic fnr* o^n^ ^r\r r^na firmer* 
VyUXlUtUla lliC gvliC lUl liXig iUl^Ci 

protein DING or BAP-1, an FTH1

pseudogene, the 3' end of the gene for a 
novel protein similar to archaeal, yeast 
and worm N2,N2-dimethylguanosine
tRNA methyltransferase, ESTs, STSs,
GSSs and two putative CpG islands,
complete sequence.




Oft 


223


gil2835704 


Mus musculus 


putative 


1420 


88 


224 


gil4595658 


Xenopus 
laevis 


LIM protein prickle 


2865 


67 


224 


gil0727796 


Drosophila 
melanogaster 


esn gene product 


698 


42 


224 


gi6634Q92 


Drosophila 
melanogaster 


LIM-domain protein


698 


42 


225 


gil3375149 


Homo sapiens


Human DNA sequence from clone 
RP5-1118M15 on chromosome 20
Contains part of a gene similar to P14
Bos taurus (P14L), a novel gene, ESTs, 
STSs, GSSs and a CpG Island, 
complete sequence. 


957 


99 


225 


gi7259265 


Mus musculus 


contains transmembrane (TM) region 


314 


50 


225 


AAY53871 


Homo sapiens 


A human brain-derived signalling 
factor polypeptide. 


299 


45 


226 


gil2803987 


Homo sapiens 


clone MGC:4174 IMAGE:3634226, 
mRNA, complete cds.


743 


100 


226 


gil2805417 


Mus musculus 


Unknown (protein for MGC:7354) 


444 


66 


226 


gil2849498 


Mus musculus 


putative 


235 


72 


227


AAY91629


Homo sapiens 


Human secreted protein sequence 
encoded by gene 23 SEQ ID NO:302. 


1391 


87 


227


gi7677403 


Homo sapiens 


F-box protein FBG2 (FBG2) mRNA,
complete cds. 


1391 


87 


227


AAY83046


Homo sapiens 


F-box protein FBP-6. 


1333 


82 


228


gil5079958 


Homo sapiens 


chromosome 11 open reading frame
24, clone MGC:19741
IMAGE:3614861, mRNA, complete
cds.


2231 


99 


228 


gil 1527205 


Homo sapiens 


DM4E3 (C11orf24) mRNA, complete
cds. 


2224 


99 
228


AAB18965


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2055 


99 


229 


gil5930199 


Homo sapiens 


Similar to RIKEN cDNA 4921523118 
gene, clone MGC:9467 
IMAGE:3914747, mRNA, complete
cds. 


1451 


99 


229 


gil3278594 


Mus musculus 


RIKEN cDNA 4921523118 gene


1440 


97 


229 


gil2856904 


Mus musculus 


putative 


1440 


97 


230 


gil5680131 


Homo sapiens 


hypothetical protein FLJ12171, clone
MGC:19889 IMAGE:4652087, mRNA,
complete cds. 


1638 


100 


230 


gil4043242 


Homo sapiens 


hypothetical protein FLJ12171, clone
MGC:15694 IMAGE:3351601, mRNA, 
complete cds. 


1638 


100 


230 


AAB93912 


Homo sapiens 


Human protein sequence SEQ ID
NO:13880.


1634 


99 


231 


AAB56947 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1525. 


779 


100 


231 


AAB68408 


Homo sapiens 


Amino acid sequence of a human 
NOV1 polypeptide.


574 


100 


231 


AAY81695 


Homo sapiens 


Human PTN protein sequence. 


574 


100 


232 


gill 138034 


Homo sapiens 


mRNA for KIAA1173 protein,
complete cds. 


2665 


100 


232 


AAG89259 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
379. 


2654 


99 


232 


gil2834372 


Mus musculus 


putative 


2427 


90 


233 


AAB98612 


Homo sapiens 


Human tumour suppressor gene, 
TSG16, protein.


1706 


55 


233 


gill596412 


Homo sapiens 


GAC-1 (GAC-1) mRNA, complete cds. 


893 


77 


233 


gi4240237 


Homo sapiens 


mRNA for KIAA0874 protein, partial 
cds. 


893 


77 


234 


AAB41108


Homo sapiens 


Human ORFX ORF872 polypeptide 
sequence SEQ ID NO:1744. 


4170 


99 


234 


gi6331287 


Homo sapiens 


mRNA for KIAA1274 protein, partial 
cds. 


3936 


99 


234 


gil545959 


Mus musculus 


paladin 


3560 


80 


235 


gi9368849 


Homo sapiens 


mRNA; cDNA DKFZp761G2113
(from clone DKFZp761G2113). 


972 


99 


235 


gi7293878 


Drosophila
melanogaster 


CG13379 gene product 


274 


36 


235 


gil4532482 


Arabidopsis 
thaliana


AT5g58570/knznl_20 


152 


31 


236 


gi3242242 


Mus musculus 


hyperpolarization-activated cation
channel, HAC2 


4309 


91 


236 


gi7407645 


Rattus 

norvegicus 


hyperpolarization-activated, cyclic 
nucleotide-gated potassium channel 1 


4306 


91 


236 


gi2708316 


Mus musculus 


brain cyclic nucleotide gated 1; Bcng- 
1; brain specific ion channel protein 


4301 


91 


237 


AAB13370 


Homo sapiens 


Human brain-associated protein 
HBAP.l. 


1055 


100 


237 


gi9944291 


Homo sapiens 


TTYH1 mRNA, complete cds.


1055 


100 


237 


gi9651109 


Macaca 
fascicularis 


TTYH1


1032 


98 
238


AAU00476 


Homo sapiens 


Human INTERCEPT 400 protein. 


1428 


100 


238 


AAY79266 


Homo sapiens 


Human elongase homologue HS3. 


1428 


100 


238 


AAB29648 


Homo sapiens 


Human membrane-associated protein 
HUMAP-5. 


1428 


100 


239 


AAB84885 


Homo sapiens 


Human protein, SEQ ID 14. 


4029 


99 


239 


AAB84882 


Homo sapiens 


Human protein, SEQ ID 6. 


4029 


99 


239 


gi5262593 


Homo sapiens 


mRNA; cDNA DKFZp434N093 (from 
clone DKFZp434N093); partial cds. 


3684 


99 


240 


gil3477247 


Homo sapiens 


Similar to RIKEN cDNA
5031400M07 gene, clone MGC:13079
IMAGE:3840918, mRNA, complete
cds.


2153 


100 


240 


AAB18987 


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2148 


99 


240 


gi7670425 


Mus musculus 


unnamed protein product 


1904 


89 


241 


AAG63222 


Homo sapiens 


Amino acid sequence of a human lipid 
metabolism enzyme. 


2194 


100 


241 


gil4861069 


Mus musculus 


phosphatidyl inositol phosphate kinase 
type II gamma


2120 


95 


241 


gi3387798 


Rattus 
norvegicus 


phosphatidylinositol 5-phosphate 4-
kinase gamma


2087 


95 


242 


gi7295732 


Drosophila
melanogaster 


ft gene product 


2915 


39 


242 


gil57409 


Drosophila
melanogaster 


fat protein 


2901 


39 


242 


gil0727403 


Drosophila
melanogaster 


ds gene product 


2236 


34 


243 


AAF90315 aa 
2 


Homo sapiens 


Winged helix/zinc finger transcription 
factor FOXP1 cDNA.


819 


98 


243 


AAB82339 


Homo sapiens 


Winged helix/zinc finger transcription
factor FOXP1.


819 


98 


243 


gil2043714 


Homo sapiens 


clone pAB195 FOXP1 (FOXP1)
mRNA, complete cds. 


819 


98 


244 


gil0440073 


Homo sapiens 


cDNA: FLJ23399 fis, clone HEP18254.


2620 


100 


244 


gi7018524 


Homo sapiens 


mRNA; cDNA DKFZp762K137 (from 
clone DKFZp762K137); partial cds. 


2524 


100 


244 


gil4133227 


Homo sapiens 


mRNA for KIAA0970 protein, partial 
cds. 


1367 


51 


245 


AAB94855 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16042. 


1347 


100 


245 


gil0436290 


Homo sapiens 


cDNA FLJ13968 fis, clone
Y79AA1001493, weakly similar to
UBIQUITIN-CONJUGATING
ENZYME E2-17 KD 9 (EC 6.3.2.19).


1347 


100 


245 


gil6198439 


Homo sapiens 


hypothetical protein FLJ13855, clone
MGC:16842 IMAGE:3915698, mRNA,
complete cds. 


1347 


100 


246 


gi6330302 


Homo sapiens 


mRNA for KIAA1185 protein, partial
cds. 


2041 


100 


246 


AAG74603 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5367.


1530 


97 


246 


AAB53321 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:861.


1530 


97 
247


gi535390 


Macronuclear 
Homo sapiens 


Human cellular retinol binding protein 
II (CRBPII) mRNA, complete cds.


715 


99 


247 


gi397352 


Mus musculus 


mCRBPII


674 


91 


247 


gil2833902 


Mus musculus 


putative 


669 


90 


248 


AAG01285 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5366. 


209 


87 


248 


AAR05562 


Homo sapiens 


Laminin-binding protein encoded by
insert from J9 lambda gt10 phage.


209 


87 


248 


gil 149509 


Gallus gallus


37kD Laminin receptor precursor/p40
ribosomal associated protein 


209 


87 


249 


gil3162226 


Homo sapiens


Human DNA sequence from clone
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZA gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein
HSPC130 (TH1 Drosophila homolog),
the gene for tubulin beta 1 class VI
(TUBB1), a gene encoding the CGI-
107 protein (LOC51012), four CpG
islands, ESTs, STSs and GSSs,
complete sequence.


1591 


100 


249 


gil 1230445 


Homo sapiens 


TUBB1 gene for human beta tubulin 1,
class VI. 


1591 


100 


249 


gi212834 


Gallus gallus 


beta-tubulin 


1340 


85 


250 


gil3162226 


Homo sapiens 


Human DNA sequence from clone
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZA gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein
HSPC130 (TH1 Drosophila homolog),
the gene for tubulin beta 1 class VI
(TUBB1), a gene encoding the CGI-
107 protein (LOC51012), four CpG
islands, ESTs, STSs and GSSs,
complete sequence.


1986 


100 


250 


gil 1230445 


Homo sapiens 


TUBB1 gene for human beta tubulin 1,
class VI. 


1986 


100 


250 


gi212834 


Gallus gallus 


beta-tubulin 


1699 


85 


251 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha
subunit, complete cds.


1566


99






251 


gi559317 


Homo sapiens


Human gene for ATP synthase alpha
subunit, complete cds (exon 1 to 12). 


1566 


99 


251 


gi34468 


Homo sapiens


H.sapiens mRNA for mitochondrial
ATP synthase.


1566 


90 


252 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha 
subunit, complete cds. 


2192 


S4 


252 


01550317 


Homo sapiens


Human gene for ATP synthase alpha

subunit, complete cds (exon 1 to 12). 




OH 


252 




Homo sapiens


H.sapiens mRNA for mitochondrial
ATP synthase.


9107 




253 


oi1455050R 


Homo sapiens


MGC:2460 IMAGE:2964524, mRNA,
complete cds.




100


253 


gil5928691 


Mus musculus 


Unknown (protein for MGC:19394)


1036 


98 


253 


gi7293133


Drosophila

melanogaster 


CG8974 gene product 


608 


66 


254 


AAF048R0 




Hitman rirntMiip T»TfttpiTi_7 

llUXLUtll {JXUlCooC pXUiCUl'/ Xvl O** / ^. 




inn 


254 


gil4043577 


Homo sapiens 


hypothetical protein FLJ12455, clone
MGC:13149 IMAGE:4298740, mRNA, 
complete cds. 


2795 


100 


254 


AAB94023 


Homo sapiens 


Human protein sequence SEQ ID
NO:14157. 


2781 


99 


255 


gi2501855 


Homo sapiens 


22 kDa actin-binding protein (SM22) 
gene, complete cds. 


937 


95 


255


gi2340833 


Homo sapiens 


DNA for SM22 alpha, complete cds. 


937 


95 


255 


gi2335047 


Homo sapiens 


mRNA for SM22 alpha, complete cds.


937 


95 




m 1 <AQAO/V>1 


Homo sapiens 


similar to prokaryotic-type class I
peptide chain release factors, clone 
MGC:20261 IMAGE:3029407, mRNA, 
complete cds. 


1948 


99 




glO/UOOjO 


Homo sapiens 


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26.
Contains a novel gene, the gene for a
novel protein similar to Prokaryotic-
type class I peptide chain release
factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein
signaling 17, ESTs, STSs, GSSs and
two putative CpG islands, complete
sequence.


1940 


99 


256 


gil5680165 


Homo sapiens 


similar to prokaryotic-type class I
peptide chain release factors, clone
MGC:20252 IMAGE:4646472, mRNA,
complete cds.


1375 


98 


257 


gil5080204 


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20261 IMAGE:3029407, mRNA,
complete cds. 


1706 


90 


257 


gi6706658 


Homo sapiens


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26.
Contains a novel gene, the gene for a
novel protein similar to Prokaryotic-
type class I peptide chain release
factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein
signaling 17, ESTs, STSs, GSSs and
two putative CpG islands, complete
sequence.


1698


89






257 


£115680165 


Homo sapiens 


similar to prokaryotic-type class I
peptide chain release factors, clone
MGC:20252 IMAGE:4646472, mRNA,
complete cds.


1133 


85 






melanogaster 


KAJ'tWMJ gCUC prOQUvl 


OiD 


At 


258 


gil2322327 


Arabidopsis 
thaliana 


unknown protein 


451 


46 






Arabidopsis 


Unknown protein 


4d1 


40 






Homo sapiens


Human protein sequence SEQ ID
NO: 17548. 


<A1 1 


100 






Homo sapiens 


cuiNA ru 14 /Hi) ns. Clone 

IN X Zlvir Jl/UZOUZ, WCaKiy Similar lO 

PROBABLE PROTEIN DISULFIDE 
(EC 5.3.4.1). 


0011 


1 AA 
lUO 


259 


eil5862252 




unnamed protein product




00 


260 


gil5(r79416 


Homo sapiens 


secreted modular calcium-binding 
nrotein 1 clone MGP" 1 
IMAGE:4549051, mRNA, complete 
cds. 


2359 


100 


260 


AAB19394 


Homo sapiens 


Amino acid sequence of a human 
secreted protein. 


2355 


99 


260 


gil0432431 


Homo sapiens 


mRNA for secreted modular calcium-
binding protein (smoc1 gene).


2343 


99 


261 


gi7020475 


Homo sapiens 


cDNA FLJ20400 fis, clone KAT00587.


1687 


100 


261 


gill 18097 


Caenorhabditis 
elegans


proline and glycine-rich


268 


33 


261 


AAW49723 


Homo sapiens 


Protein polymer adhesive substrate 
PPAS1-F.


261 


32 


262 


gil6197949 


Drosophila 
melanogaster 


LD21896p 


325 


29 


262 


gi7293303 


Drosophila 
melanogaster 


CG9059 gene product


325 


29 


262 


gi3170539 


Takifugu
rubripes


unknown 


291 


40 


263 


AAB42525 


Homo sapiens 


Human ORFX ORF2289 polypeptide 
sequence SEQ ID NO:4578. 


3570 


80 


263 


gi2887497 


Homo sapiens 


chromosome 19, overlapping cosmids 
R28707 and R34001, complete
sequence. 


3570 


80 


263 


AAB42538 


Homo sapiens 


Human ORFX ORF2302 polypeptide 
sequence SEQ ID NO:4604. 


2835 


99 


264 


gil4017849 


Homo sapiens 


mRNA for KIAA1816 protein, partial 
cds. 


1637 


99 


264 


gi8655687 


Homo sapiens 


mRNA; cDNA DKFZp762E1511
(from clone DKFZp762E1511).


892


100






264 


gi6979930 


Homo sapiens 


Maml mRNA, partial cds. 


315 


30 


265 


gil2836420 


Mus musculus 


putative 


2511 


93 


265 


gil0437002 


Homo sapiens 


cDNA: FLJ21013 fis, clone
CAE05223. 


1859 


99 


265 


AAB58385 


Homo sapiens 


Lung cancer associated polypeptide 
sequence SEQ ID 723. 


1704 


99 


266 


gil4198321 


Mus musculus 


ribosomal protein L31


543 


92 


266 


&57115 


Rattus 
norvegicus 


ribosomal protein L31 (AA 1-125)


543 


92 


266 


gil4586963 


Mus musculus 


M75 


543 


92 


267 


gil78424 


Homo sapiens 


Human apolipoprotein A-II mRNA,
complete cds. 


478 


96 


267 


gi296634 


Homo sapiens 


Human gene for apolipoprotein AII.


478 


96 


267 


gi296633 


Homo sapiens 


Human DNA for apolipoprotein A-II


478 




268 


AAB47184 


Homo sapiens 


ACPLX protein sequence


J J 1 1 


inn 


268 


gi7321168 


Homo sapiens 


Human DNA sequence from clone 
RP5-860F19 on chromosome 20p12.3-
13 Contains the gene for KIAA1442
(similar to olfactory neuronal
transcription factors (COE1, COE2,
COE3, EBF3, OLF1)), RPL19 (60S
ribosomal protein L19) and HSPC080
pseudogenes, the gene for
metallocarboxypeptidase (CPX-I) and
a novel gene. Contains ESTs, STSs,
GSSs and four CpG islands, complete
sequence.


3571 


100 


268 


AAB36174 


Homo sapiens 


Human APG04 protein. 


3567 


99 


269 


gi2314829


Homo sapiens 


jerky gene product homolog mRNA,
complete cds. 


1430 


59 


269 


gil0140857 


Mus musculus 


jerky 


752 


33 


269 


AAG62624 


Homo sapiens 


Human cell nucleus regulatory protein 
56. 


598 


34 


270 


gi7959227 


Homo sapiens 


mRNA for KIAA1483 protein, partial 
cds. 


2231 


99 


270 


gi34192 


Homo sapiens 


Human KUP mRNA for protein with 
two zinc fingers.


627 


39 


270 


gil33 10782 


Mus musculus 


myoneurin 


315 


24 


271 


AAB93814 


Homo sapiens 


Human protein sequence SEQ ID
NO:13604. 


1408 


97 


271 


gil0433080 


Homo sapiens 


cDNA FLJ11753 fis, clone
HEMBA1005583. 


1408 


97 


271 


AAB41771 


Homo sapiens 


Human ORFX ORF1535 polypeptide
sequence SEQ ID NO:3070.


821 


99 


272 


gi7959197 


Homo sapiens 


mRNA for KIAA1468 protein, partial 
cds. 


4603 


100 


272 


gil5080502 


Homo sapiens 


clone MGC:16944 IMAGE:4339646,
mRNA, complete cds.


4317 


94 


272 


gi9755831 


Arabidopsis 
thaliana


putative protein 


675 


27 


273 


gil5080502 


Homo sapiens 


clone MGC:16944 IMAGE:4339646,
mRNA, complete cds.


4362 


98 
273


gi7959197 


Homo sapiens


mRNA for KIAA1468 protein, partial
cds. 


4360 


96 


273 


gi9755831 


Arabidopsis 
thaliana 


putative protein 


704 


28 


274 


AAB92483 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 10570. 


2626 


100 


274 


gi7021875 


Homo sapiens 


cDNA FLJ10051 fis, clone
HEMBA1001281.


2626 


100 


274 


gil2837616 


Mus musculus 


putative 


2065 


90 


275 


cil 07 16076 




protein, complete cds. 


273Q 


ion 


275 


0114043332 


Homo sapiens


^imilsiT' tf\ TiTify fin OPT nmt'Pin 0^ dnnp 

MGC:2475 IMAGE:3051389, mRNA,
complete cds.






275 


cil07 16078 


Mus musculus


tpQriQ— jiHnnHjiTit fin cpt nTritpin 






276 


AAB44673 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 33 SEQ ID NO:138.


1014 


96 


276 


gil747 


Oryctolagus
cuniculus 


trichohyalin


213 


22 


276


gil3936996 


Human 
herpesvirus 8 


ORF73 


203 


22 


277 


AAG74326 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5090.


1101 


100 


277 


AAB56461 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1039. 


11% 


100 


277


gil2842930 


Mus musculus 


putative 


688 


90 


278


gliUzU14D 


Homo sapiens 


Human DNA binding protein (HPF2)
mRNA, complete cds.


1528 


47 


278




Homo sapiens 


Human DNA sequence from clone 
RP1-54B20 on chromosome Xp11.1-
11.3. Contains the 5' end of a novel
SSX family protein gene, two novel 
r^Jsj\D dua comaiTiifig ^zxx^ vP^ 2uic 

finger protein genes, a KRAB box 

protein pseudogene, the gene for a

novel protein similar to lysozyme C 
(1,4-beta-N-acetylmuramidase), the
ZNF81 gene for zinc finger protein 81 
(HFZ20), ESTs, STSs, GSSs and three 
CpG islands, complete sequence. 


1497 


55 


278 


gi4981S2 


Homo sapiens 


Human mRNA for KIAA0065 gene, 
partial cds. 


1495 


46 


279 


gi2914676 


Homo sapiens


chromosome 16, cosmid clone 360H6 
(LANL), complete sequence.


882 


35 


279 


gil4250678 


Homo sapiens 


clone MGC:10489 IMAGE:3945548, 
mRNA, complete cds. 


882 


35 


279 


gi2342506 


Homo sapiens 


mRNA for zinc finger protein FPM315,
complete cds.


875 


35 


280 


gi434779 


Homo sapiens 


Human mRNA for KIAA0112 gene,
partial cds. 


2072 


100 


280 


gil5278392 


Homo sapiens 


homolog of yeast ribosome biogenesis
regulatory protein RRS1, clone
MGC:4831 IMAGE:3603972, mRNA,
complete cds.


1905


100






280 


gil2804751 


Homo sapiens 


Similar to regulator for ribosome 
resistance homolog (S. cerevisiae), 
clone MGC:2755 IMAGE:2824034, 
mRNA, complete cds. 


1905 


100 


281 


AAB95761 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18686. 


789 


100 


281 


AAG81272 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:62. 


789 


100 


281 


gil4035852 


Homo sapiens 


unnamed protein product 


789 


100 


282 


oil 5080911 




neo-poly(A) polymerase mRNA,
complete cds.


3797 


99 


282 


gil5384858 


Homo sapiens 


mRNA for poly(A) polymerase gamma 


3797 


99 


282 


gil3641252 


Homo sapiens 


SRP RNA 3' adenylating enzyme/pap2 
mRNA, complete cds. 


3779 


99 








(from clone DKFZp434A1014); partial 

cds. 


1437 


85 


283 


gil2853788 


Mus musculus 


putative 


408 


38 


283 


pi4468790 


laevis 


cn#*i*f)v nrAtPin 


154 


26 


284 


gi3327062 


Homo sapiens 


mRNA for KIAA0624 protein, partial 
cds. 


10179 


99 


284 


gil3702612 


Staphylococcus
aureus subsp.
aureus N315


ORFID:SA2447~hypothetical protein,
similar to streptococcal hemagglutinin 
protein 


223 


19 


284 


gil4248429 


Staphylococcus
aureus subsp.
aureus Mu50


hypothetical protein 


223 


19 


285 


gil2697941 


Homo sapiens 


mRNA for KIAA1698 protein, partial 
cds. 


4716 


100 


285 


gi7299794 


Drosophila 
melanogaster 


CG9591 gene product 


290 


31 


285 


AAR99256 






09 


*f\f 


286 


AAG62395 


Homo sapiens 


Human zinc finger protein 46. 


2375 


100 






Homo sapiens


Human DNA sequence from clone
RP11-393J16 on chromosome 10. 

zinc finger protein 33a (KOX 31), a 
novel gene for a novel KRAB box 
containing zinc finger gene, a zinc 
finger pseudogene, ESTs, STSs, GSSs 
and two putative CpG islands, complete
sequence. 






286 


gi881564 


Homo sapiens 


Human zinc finger containing protein 
ZNF157 (ZNF157) mRNA, complete 
cds. 


1339 


51 


287 


gi2822143 


Homo sapiens 


chromosome 19, cosmid R30217, 
complete sequence. 


1838 


53 


287 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304
gene).


1735 


50 


287 


gil3543419 


Homo sapiens 


Similar to zinc finger protein 304,
clone MGC:4079 IMAGE:3530863,
mRNA, complete cds.


1735


51






288 


gi540469 


Homo sapiens


(clone HGT26) T cell receptor gamma-
chain mRNA, V region. 


399 


91 


288 


ci3 047024 


Homo sapiens


T-cell receptor gamma V1 gene region


384 


100 


288 


ei339167 


Homo sapiens


Human T-cell receptor rearranged
gamma-chain gene V-region (V4)
(subgroup I).


384 


100 


289 


AAY69976 


Homo sapiens


DHFR-HM protein


886 


93 


289 


sil82724 


Homo sapiens




886 


93 


289 


gil82717 


Homo sapiens


Human dihydrofolate reductase gene,
exon 6 and 3' flank.


886 


93 


290 


AAE01782 


Homo sapiens


Human cene 1*^ encnHeii ^ecT^ted 
protein HDPNW93, SEQ ID NO:103. 


4269 


99 


290 


eil0437433 


Homo sapiens


cDNA: FLJ21347 fis, clone
COL02724. 


4127 


97 


290 


AAB74693 


Homo sapiens 


Human protease and protease inhibitor 
PPIM-26. 


3948 


99 


291 




Mus musculus




OSS 


on 


291 


gil2844277 


Mus musculus 


putative 


800 


79 


291 




Homo sapiens


riuiiiaii D CO 1 sccrcicu proism oci^ 
NO:541. 


OHO 


00 


292 






FPTPil 


970R 




292 


gil5141735 


Homo sapiens 


unnamed protein product 


2798 


98 


292




Homo sapiens


miviN/v lor ciiroiijosoniC iz open 


Old 




293 


gil0440367 


Homo sapiens 


mRNA for FLJ00018 protein, partial
cds 


5938 


100 


293 


gil5488570 


Homo sapiens 


Similar to hypothetical protein
FLJ00018, clone MGC:10073
IMAGE:3896004, mRNA, complete
cds.


4736 


99 


293 


gil0438857 


Homo sapiens 


cDNA: FLJ22458 fis, clone
HRCIOOOI. 


1570 


99 


294 


AAB08948 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO: 105. 


1601 


99 


294 


AAB08911 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO:68. 


1601 


99 


294 


AAB80238 


Homo sapiens 


Human PRO238 protein.


641 


44 


295 


AAB18457 


Homo sapiens 


A human TANGO 216 polypeptide
clone. 


2106 


98 


295 


AAB18447 


Homo sapiens 


Amino acid sequence of human 
TANGO 216 polypeptide. 


2106 


98 


295 


gil4017381 


Homo sapiens 


tumor endothelial marker 8 precursor
(TEM8) mRNA, complete cds. 


1231 


57 


296 


gil4388342 


Macaca 
fascicularis 


hypothetical protein 


3833 


92 


296 


gi7243195 


Homo sapiens 


mRNA for KIAA1407 protein, partial 
cds. 


3817 


100 


296 


gil5451319 


Macaca 
fascicularis 


hypothetical protein 


2408 


91 


297 


gi7243039 


Homo sapiens 


mRNA for KIAA1329 protein, partial
cds. 


4761 


100 
297


gil2007720 


Mus musculus 


VPS10 domain receptor protein
oorv^dZ 


4466 


88 


297 


gi7715916 


Mus musculus 


SorCSb splice variant of the VPS10
domain receptor SorCS


2177 


47 




AAMUO0I2 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 175. 


1488 


99 


298 


gil2846045 


Mus musculus 


putative 


1387 


65 


298 


AAM00925 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 401. 


996 


100 


299 


gi7298852 


Drosophila
melanogaster 


CG10068 gene product 


609 


43 


299 


gi8655669 


Homo sapiens 


mRNA; cDNA DKFZp547C176 (from 
clone DKFZp547C176).


482 


52 


299 


AAB42048 


Homo sapiens 


Human ORFX ORF1812 polypeptide 
sequence SEQ ID NO:3624. 


325 


46 


300 


gil4043285 


Homo sapiens 


Similar to KIAA0808 gene product, 
clone MGC:15550 IMAGE:3529159,
mRNA, complete cds. 


1306 


97 


300 


gi7263912 


Homo sapiens 


Human DNA sequence from clone 
RP5-1108D11 on chromosome 20q12-
13.11 Contains part of the gene for a
novel protein similar to C. elegans 
T22C1.7, part of the gene for a novel 
HMG (high mobility group) box 
protein similar to KIAA0737, 
KIAA0808 and TNRC9 (CAGF9), 
ESTs, STSs, GSSs and two putative 
CpG islands, complete sequence. 


797 


96 


300 


gi3882337 


Homo sapiens 


mRNA for KIAA0808 protein, 
complete cds. 


767 


55 


301 


gil5430292 


Homo sapiens 


muscle alpha-kinase (MAK) mRNA, 
complete cds. 


5445 


99 


301


gi7243041


Homo sapiens 


mRNA for KIAA1330 protein, partial
cds. 


4933 


100 


301




Mus musculus 


myocytic induction/differentiation
originator 


3684 


72 


302 


gil4550508 


Homo sapiens


Similar to CG8974 gene product, clone 
MGC:2460 IMAGE:2964524, mRNA, 
complete cds.


589 


100 


302 




Mus musculus


Unknown (protein for MGC:19394)






302 


gi2564951 


Mus musculus 


unknown 


378 


72 


303 


gi7242955 


Homo sapiens 


mRNA for KIAA1300 protein, partial 
cds. 


9573 


99 


303 


gi6599162 


Homo sapiens 


mRNA; cDNA DKFZp434N1272 
(from clone DKFZp434N1272); partial 
cds. 


1392 


98 


303 


AAG75083 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5847.


628 


92 


304 


gil408209 


Homo sapiens 


Human endogenous retrovirus HERV- 
K(HML6) proviral clone HML6.17 
putative polymerase and envelope 
genes, partial cds, and 3'LTR. 


398 


86 


304 


gi2801455 


Mouse
mammary
tumor virus


Pr160


176


48








IfiA 




Exogenous
mouse
mammary
tumor virus


Gag-Pro-Pol


I/O 


4o 






Homo sapiens 


unconventional myosin 1G valine form
(MYO1G) mRNA, MYO1G-V allele,
partial cds. 


3269 


9o 


305 


gil4269504 


Homo sapiens 


unconventional myosin 1G methionine
form (MYO1G) mRNA, MYO1G-M
allele, partial cds.


3266 


97 


305




Rattus 
norvegicus 


myosin I 


3130 


57 


306




Homo sapiens 


1 Ir-l mteractmg pepnae 20 mRNA, 
partial cds. 


20ol 


99 






Homo sapiens 


Human mKiNA lor J\iAAU3zo gene, 
partial cds. 


648 


39 






Homo sapiens 


Human zinc finger protein ZNF135
mRNA, complete cds.


590 


40 


307 


gil3183883 


Homo sapiens 


PD-1-ligand 2 protein (PDL2) mRNA,
complete cds.


1417 


99 






Homo sapiens 


butyrophilin precursor B7-DC mRNA,
complete cds. 


1 >l 1 T 

1417 


99 


j\f r 




Homo sapiens


Human gene 1 encoded secreted 
protein roPPA04, SEQ ID NO:74. 


1 y| K 
1410 


OA 






Homo sapiens


iiuman gene encoaeo secrecea 
protein fragment, SEQ ID NO:177. 


JO J 


1 AA 




A AMARUS 


Homo sapiens


Human protein sequence SEQ ID
NO:16072. 




1 AA 


308 


gil0436314 


Homo sapiens 


cDNA FLJ13984 fis, clone
Y79AA1001846. 


383 


100 






Homo sapiens


Human Rap2 amino acid sequence. 


206


33 


309 


gi4678734 


Homo sapiens


Human gene from PACs 37M17 and 
305B16, chromosome X, similar to
small G proteins, especially RAP-2A. 


206 


33 


309 




Homo sapiens


Human bone marrow protein, SEQ ID
NO: 432. 


/UO 


dZ 


310 




Homo sapiens


nuiuan uusss/x jot i -ceij iccepior 
alpha-chain HAP50 V(a)8.2-J(a)M. 




1 AA 


310 


gil223888 


synthetic 
construct 


T cell receptor alpha chain 


586 


100 


310 


gi2358036 


Homo sapiens 


T-cell receptor alpha delta locus from
bases 250472 to 501670 (section 2 of
5) of the Complete Nucleotide
Sequence.


586 


100 


311 


AAE01596 


Homo sapiens 


Human gene 13 encoded secreted 
protein HCICJ15, SEQ ID NO:146.


1066 


92 


311 


AAE04136 


Homo sapiens 


Human gene 6 encoded secreted 
protein HCLBW50, SEQ ID NO:123. 


1066 


92 


311 


gi31135 


Homo sapiens 


H.sapiens mRNA for elongation factor
1-beta. 


1066 


92 


312 


gi7243137 


Homo sapiens 


mRNA for KIAA1378 protein, partial 


2400 


99 
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312 


gi12314036


Homo sapiens


Human DNA sequence from clone 
rvrj*jojj*T uu cnromosome loz^. i- 
24.3 Contains part of a gene encoding a 
iiiuui ^uiiiaiuiiig uauiciH) pan oi a 
novel gene encoding a protein similar 

lu x\opalijrl~i JvlNA ayuiuciiisey a 

putative novel gene, a 40S nbosomal 
CpG islands, ESTs, STSs and GSSs, 


1184 


44 


312 


gi4650844 


Homo sapiens 


mRNA for Kelch motif containing 

protein, complete cds.


1176 


44 


313 


gi7019945 


Homo sapiens 


cDNA FLJ20079 fis, clone COL03057.


1610 


83 


313 


gi12804721


Homo sapiens


mRNA, complete cds. 


iZ/1 


4o 


313 


AAB43912 


Homo sapiens


sequence SEQ ID NO: 1357.


iZjj 


AK. 

hD 


314 


AAB41414 


Homo sapiens 


Human ORFX ORF1178 polypeptide

sequence SEQ ID NO:2356. 




yi 


314 


gi6329897 


Homo sapiens


mRNA for KIAA1137 protein, partial
cds. 


4798 


98 


314 


gi14043759


Homo sapiens 


clone IMAGE:4111596, mRNA,
partial cds.


3906 


98 


315 


AAB28375 


Homo sapiens 


Human hyperpolarisation-activated 
channel HAP^ 


3686 


99 


315 


gi7959337 


Homo sapiens 


mRNA for KIAA1535 protein, partial 
cds. 


3665 


99 


315 


gi3242244 


Mus musculus 


hyperpolarization-activated cation
channel HA03 


3556 


96 


316 


gi14198399


Mus musculus 


RIKEN cDNA 1500034J20 gene 


837 


93 


316 


gi12854536


Mus musculus 


putative 


837 


93 


316 


gi14250857


Homo sapiens 


Human DNA sequence from clone 

RP^.I 1 "^701 7 €tM\ rKmm/^cnmo 1 1«%1 0 

xvr>^*i u i\jx / UU Wriuoiiiusome i jpiz- 

14 >^ PontsllTKS nsirt' a <»Anf» cimilai* fe\ 
A^>^ v^wuiAius uall MX a kCXXC oUXlUcu lO 

putative mitochondrial inner
membrane protease subunit 2, a novel
mRNA, ESTs and GSSs, complete
sequence.


775 


100 


317 


gi10439850


Homo sapiens 


cDNA: FLJ23233 fis, clone
CAS00458. 


1081 


50 


317 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304 
gene). 


1039 


48 


317 


gi14249844


Homo sapiens 


Similar to hypothetical protein
FLJ23233, clone MGC:14876
IMAGE:3544044, mRNA, complete

cds.


1037 


47 


318 


gi11863686


Mus musculus 


neurobeachin 


3371 


96 


318 


gi11863539


Gallus gallus 


neurobeachin 


2100 


89 


318 


AAB92596 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10843. 


1721 


100 


319 


gi12698174


Macaca 
fascicularis


hypothetical protein 


1221 


95 
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319 


gi10439153


Homo sapiens 


cDNA: FLJ22672 fis, clone HSI09265.


1085 


99 


319 


gi7020125 


Homo sapiens 


cDNA FLJ20190 fis, clone COLF0714.


893 


50 


320 


gi2865219 


Homo sapiens 


integrin binding protein Del-1 (Del1)
mRNA, complete cds. 


447 


100 


320 


AAW94685 


Homo sapiens 


Human Del-1 protein. 


438 


98 


320 


AAW10365 


Homo sapiens 


Human developmentally-regulated
endothelial cell locus-1 protein.


438 




321 


AAB27246 


Homo sapiens 


Human EXMAD-24 SEQ ID NO: 24


2047 


100 


321 


AAB42385 


Homo sapiens 


Human ORFX ORF2149 polypeptide 
sequence SEQ ID NO:4298. 


2047 


100 


321 


gi52998 


Mus musculus 


macrophage mannose receptor 
precursor


164 


31 


322 


gi12834087


Mus musculus


putative




R9 


322 


gi2463628 


Homo sapiens 


Human putative monocarboxylate

transporter (MCT) mRNA, complete
cds.


506 




322 


gi2198807 


Gallus gallus 


monocarboxylate transporter 3


473 


27 


323 


gi15620909


Homo sapiens 


mRNA for KIAA1925 protein, partial 
cds. 


1059 


38 


323 


AAB92496 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10598.


1050 


36 


323 


gi7021900


Homo sapiens


cDNA FLJ10065 fis, clone
HEMBA1001455. 






324 


gi9651075 


Macaca
fascicularis 


unnamed protein product




yj 


324 


gi15145795


Sus scrofa


basic proline-rich protein


999 




324 


gi5917666


Zea mays


extensin-like protein 


195 


25 


325 


gi7529597


Homo sapiens


Human DNA sequence from clone

RP3-402N21 on chromosome 6p21.1- 

Ox *W ^ontflific ifn ^rs HVitpP' ti/ii/aI o^n^c 
^i^»JX» v^vruuiuio Up iU IXIlCv lliJVCl gCUCo 

with MAM and immunoglobulin 
domains. Contains ESTs, STSs, GSSs
and four putative CpG islands, 
complete sequence. 


14/4 


1 An 


325 


gi12836077


Mus musculus 


putative 


1365 


95 


325 


AAE00586 


Homo sapiens 


Human nuclear cell adhesion molecule 
homologue, NCAM d 2 protein. 


1303 


49 


326 


gi15278193


Homo sapiens 


MAGMC beta mRNA, complete cds,
alternatively spliced. 


1492 


100 


326 


gi2702351 


Mus musculus 


putative membrane-associated 
guanylate kinase 1 


1112 


83 


326 


gi5817255 


Homo sapiens 


mRNA; cDNA DKFZp434B203 (from 
clone DKFZp434B203); partial cds. 


739 


100 


327 


AAB01432 


Homo sapiens 


Human TANGO 239 (form 2). 


3675 


99 


327 


AAB01426 


Homo sapiens 


Human TANGO 239. 


2700 


100 


327 


AAB00036 


Homo sapiens 


Human TANGO 239 partial sequence. 


2483 


97 


328 


gi7243117 


Homo sapiens 


mRNA for KIAA1368 protein, partial 

cds. 


5542 


100 


328 


AAY71460 


Homo sapiens 


Human semaphorin 6A-1.


5422 


98 


328 


gi10187891


Homo sapiens 


unnamed protein product 


5422 


98 


329 


gi13676461


Macaca 
fascicularis 


hypothetical protein 


2193 


75 






WO 02/081731
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Table 2 



SEQ ID NO:


Accession No.






Score 


% Identity


329 


gi4589566 


Homo sapiens 


mRNA for KIAA0961 protein, 
complete cds. 


2190 


75 


329 


gi456269 


Mus musculus 
domesticus


zinc finger protein 30 


2073 


71 


330 


AAB94295 


Homo sapiens 


Human protein sequence SEQ ID 


3062 


99 


330 




Homo sapiens


cLiiNA i*LJiz/oo US, Clone 
NT2RP2001576, weakly similar to 
HYPOTHETICAL 62.2 KD PROTEIN 
C4G8.12C IN CHROMOSOME 1. 




oo 


330 


gi7291781 


Drosophila 
melanogaster 


CG3419 gene product 


471 


32 


331 


gi12852801


Mus musculus 


putative 


1185 


95 






Homo sapiens 


Human DNA sequence from clone 
Kr j-o4or 1 3 on cnromosome Ipzl . 1 - 
22. 1 Contains part of the PPAP2C 
(phosphatidic acid phosphatase type 2c)
gene, ESTs, STSs and GSSs, complete
sequence. 


975 


100 


331 




Homo sapiens


cuiNA rLJzujuu ns, cione xiiix^uoHOD. 


/ho 


Do 


332 


gi12309630


Homo sapiens 


Human DNA sequence from clone 
AvTi i-HJooz^ on cxuomosome y 

leucine-rich repeat protein, ESTs, STSs 
and GSSs, complete sequence.


3138 


100 


332 


AAB31161 


Homo sapiens 


Amino acid sequence of a human 
TOLL protein.


2600 


86 


332 


gi13444976


Homo sapiens 


unnamed protein product 


2600 


86 


333 


gi4240145 


Homo sapiens


mRTJA fitT JCT A AOROSi nrnf^m nartial 

cds* 




00 


333 


gi14249936


Homo sapiens 


Similar to S-adenosylhomocysteine 
hydrolase-like 1, clone 
IMAGE:3536052, mRNA, partial cds.


3202 


100 


333 


AAW56097 


Homo sapiens 


Amino acid sequence of the 0DD4b53
enzyme. 


2466 


84 


334 


gi13625385


Homo sapiens 


EPI64 (EPI64) mRNA, complete cds. 


1026 


46 


334 


AAB95321 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17577. 


1023 


50 


334 


gi10435007


Homo sapiens 


cDNA FLJ13130 fis, clone
NT2RP3002972, weakly similar to
Halocynthia roretzi mRNA for HrPET-
1. 


1023 


50 


335 


gi15862408


Homo sapiens 


unnamed protein product 


2255 


95 


335 


gi13272520


Mus musculus 


pancreatitis-induced protein 49 


2021 


85 


335 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64-1776 cDNA clone. 


1784 


95 


336 


gi15862408


Homo sapiens 


unnamed protein product


2281 


99 


336 


gi13272520


Mus musculus 


pancreatitis-induced protein 49 


2047 


88 


336 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64.1776 cDNA clone. 


1810 


99 


337 


gi4545313 


Mus musculus 


prominin-like protein 


1021 


77 


337


gi15042603


Rattus 
norvegicus 


prominin 


647 


30 



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SEQIDNO: 


Accession No. 


Species 






%

Identity 


337 


AAB94028 


Homo sapiens


Human protein sequence SEQ ID

NO:14170. 






338 


gi2978255 


Mus musculus 


myeloid zinc finger protein-2 


212 


42 


338 


AAB54292 


Homo sapiens 


Human pancreatic cancer antigen
protein sequence SEQ ID NO:744. 


208 


30 


338 


gi8886436 


Homo sapiens 


myeloid zinc finger protein 1 splice 
variants (ZNF42) gene, complete cds, 
alternatively spliced. 


207 


42 


339 


gi3882269 


Homo sapiens 


mRNA for KIAA0774 protein, partial 
cds. 


5974 


99 


339 


gi12860422


Mus musculus 


putative 


692 


96 


339 


gi15424451


Homo sapiens 


hATIP3 


606 


36 


340 


AAB36617 


Homo sapiens 


Human FLEXHT-39 protein sequence 
SEQ ID NO:39.


584 


100 


340 


gi8218050 


Homo sapiens 


Human DNA sequence from clone 
RP1-187J11 on chromosome 6q11.1-
22.33. Contains the gene for a novel
protein similar to S. pombe and S.
cerevisiae predicted proteins, the gene
for a novel protein similar to protein
kinase C inhibitors, the 3' end of the
gene for a novel protein similar to
Drosophila L82 and predicted worm
proteins, ESTs, STSs, GSSs and two
putative CpG islands, complete


562 


100 


340


gi13540300


Mus musculus 


nucleolar protein C7B 


415 


66 


341 


gi14583268




cyiopiasmic proiem miviNA, compieie 
cds. 


o2o 


62 


341 




Homo sapiens


echinoderm microtubule-associated
complete cds.




65 


341 


gi4406218 


Homo sapiens

aawaIJW irg J rl vim 


protein-like ^1AP2 mRNA, complete 
cds. 




CO 


342 


AAB60099 


Homo sapiens 


Human transport protein TPPT-19. 


1616 


93 


342 


gi7294748 


Drosophila
melanogaster 


CCYFfi\fi opne fiTArfiirt 




A1 


342 


gi14714781








OJ 


343 


AAB94374 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14915. 


3938 


99 


343 


gi10434690


Homo sapiens 


cDNA FLJ12921 fis, clone
NT2RP2004600. 


3938 


99 


343 


gi5689736 


Homo sapiens 


mRNA for myopodin. 


883 


34 


344 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


111 


100 


344 


gi10953950


Geochelone 
carbonaria


alpha-D chain hemoglobin 


407 


54 


344 


gi4455876 


Cairina
moschata 


alpha D-globin


398 


53 


345 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


668 


78 


345 


gi10953950


Geochelone 


alpha-D chain hemoglobin


359 


43 



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SEO ID NO: 


Accession No 




Di>G <* r*2n^ An 




%

Identity 














345 


gi4455876 


Cairina 
moschata


alpha D-globin 


349 


41 


346 






mRNA; cDNA DKFZp547C176 (from

clone DKFZp547C176).




iUU 






Homo sapiens


Human ORFX ORF1812 polypeptide
sequence SEQ ID NO:3624. 




JUU 






Drosophila
melanogaster 


\^vjiuuoo gene proouci 




Af\ 

4U 


347 


gi15778899


Homo sapiens 


Similar to f-box only protein 17, clone
complete cds. 


1537 


99 






Macaca 
fascicularis 


unnamed protein product 


1435 


95 




oil ^OIAKOI 


Homo sapiens 


Similar to f-box only protein 17, clone

mXJK^.yj /y iMAlJC.Jo04/0U, mKNA, 


Of? 

857 


56 


348 


AAG64860 


Homo sapiens 


Heart muscle cell differentiation related 


1079 


90 


348 


AAB99931 


Homo sapiens 


Human MesPl protein sequence SEQ 


1079 


90 


348 


gi13623241


Homo sapiens 


Similar to mesoderm posterior 1, clone 
MGC:10676 IMAGE:3944350, mRNA, 
complete cds. 


1079 


90 


349 


gi4235144




26L23), complete sequence. 


JO f 


1 HA 
lUU 


349 


gi8163824 


Homo sapiens 


krueppel-like zinc finger protein HZF2 


290 


74 


349 


AAY39779 


Homo sapiens 


CBMACD04 protein sequence. 


286 


71 


350 


gi7673618 


Mus musculus 


ubiquitin specific protease 


2016 


73 


350 




Homo sapiens


mRNA for KIAA1063 protein, partial
cds. 


2UU0 


o4 


350 


gi16198231


Drosophila 

melanogaster 


LD43147P 


1188 


46 


351 


gi13540193


Homo sapiens 


isopentenyl pyrophosphate isomerase 1 
(IDI1), HT009-like protein, and
isopentenyl pyrophosphate isomerase 
type 2 (IDI2) genes, complete cds. 


1202 


100 






Homo sapiens 


isopentenyl diphosphate dimethylallyl 
diphosphate isomerase 2 (IDI2) gene,
exon 4 and complete cds. 


1202 


100 


351 


gi13925769


Homo sapiens 


isopentenyl diphosphate dimethylallyl 
diphosphate isomerase 2 (IDI2) 
mRNA, complete cds.


1202 


100 


352 


gi13561001


Homo sapiens


Human DNA sequence from clone
RP11-528A10 on chromosome 6
Contains an IMPDH1 (IMP (inosine
monophosphate) dehydrogenase 1)
pseudogene, an RNA helicase
pseudogene, a novel gene similar to
KIAA0161, ESTs, STSs and GSSs,
complete sequence. 


950 


100 


352 


gi13991706


Mus musculus 


UbcM4-interacting protein 4


655 


53 



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S£QIDNO: 


Accession No. 


Species 


Description 


Score 


% 
Identity 


352 


gi1136384


Homo sapiens 


Human mRNA for KIAA0161 gene,

complete cds. 


651 


53 


353 


gi13561001


Homo sapiens 


Human DNA sequence from clone 
RP11-528A10 on chromosome 6
Contains an IMPDH1 (IMP (inosine
monophosphate) dehydrogenase 1)
pseudogene, an RNA helicase
pseudogene, a novel gene similar to
KIAA0161, ESTs, STSs and GSSs,
complete sequence. 


709 


79 


353 


gi13991706


Mus musculus 


UbcM4-interacting protein 4 


506 


45 


353 


gi1136384


Homo sapiens 


Human mRNA for KIAA0161 gene,
complete cds. 


502 


44 


354 


AAB74446 


Homo sapiens 


Human protease-inhibitor like protein. 


2759 


100 


354 


gi12053227


Homo sapiens 


mRNA; cDNA DKFZp434B044 (from 
clone DKFZp434B044); complete cds. 


2756 


99 


354 


gi15593902


Homo sapiens 


unnamed protein product 


2743 


99 


355 


AAB94358 


Homo sapiens 


Human protein sequence SEQ ID
NO:14883. 


1788 


98 


355 


gi10434632


Homo sapiens 


cDNA FLJ12886 fis, clone
NT2RP2004041, weakly similar to
SYNAPSINS IA AND IB.


1788 


98 


355 


gi12052738


Homo sapiens 


mRNA; cDNA DKFZp564H1322
(from clone DKFZp564H1322);
complete cds.


1788 


98 




gi13436437


Homo sapiens 


Similar to RIKEN cDNA 5730438N18
gene, clone MGC:4399
IMAGE:2905957, mRNA, complete

cds.


1634 


99 


356 


gi15030091


Mus musculus 


Similar to RIKEN cDNA 5730438N18 
gene 


1508 


91 


356 


AAB43372 


Homo sapiens 


Human ORFX ORF3136 polypeptide
sequence SEQ ID NO:6272.


1464 


91 


357 


AAB73511 


Homo sapiens 


Human transferase HTFS-18, SEQ ID

JNUllo. 


1880 


99 


357 


AAG74560 


Homo sapiens 


Human colon cancer antigen protein 


450 


98 


357 


AAG02792 


Homo sapiens 


Human secreted protein, SEQ ID NO:

Oo/3. 


324 


96 


358 


gi7673618


Mus musculus


ubiquitin specific protease


971 1 




358 


gi5689463 


Homo sapiens 


mRNA for KIAA1063 protein, partial 
cds. 


2382 


78 


358 


gi5823525 


Drosophila 
melanogaster 


ubiquitin-specific protease nonstop 


1305 


49 


359 


AAB94775 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15864. 


1022 


100 


359 


gi10435984


Homo sapiens


cDNA FLJ13842 fis, clone
THYRO1000793.


1022 


100 


359 


gi2340162 


Xenopus 
laevis 


dsRBP-ZFa 


380 


44 


360 


gi3676086 


bacteriophage 
PS119 


gpl9 


291 


59 


360 


gi1778468


Escherichia 


hypothetical protein 


287 


59 



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Accession No. 


Species 


Description 


Score 


% 
Identity 






coli 








360 


gi1786768


Escherichia 
coli K12


bacteriophage lambda lysozyme 
homolog 


287 


59 


361 


gi13544003


Homo sapiens 


clone IMAGE:3677165, mRNA,
partial cds. 


2172 


88 


361 


gi3169073


Schizosaccharomyces
pombe


phenylalanyl-tRNA synthetase,
mitochondrial precursor 


233 


33 


361 


gi13877969


Arabidopsis 
thaliana 


putative phenylalanine-tRNA

synthetase 


228 


35 


362 


gi293694 


Mus musculus 


laminin receptor 


370 


49 


362 


gi13277921


Mus musculus 


laminin receptor 1 (67kD, ribosomal 

protein SA) 


367 


49 


362 


gi4633839 


Mus musculus 


37kDa oncofetal antigen 


367 


49 


363 


gi15082271


Homo sapiens 


testes development-related NYD-SP21 
mRNA, complete cds. 


1876 


100 


363 


gi6807923 


Homo sapiens 


mRNA; cDNA DKFZp434H092 (from 
clone DKFZp434H092); partial cds. 


1620 


100 


363 


gi7294427 


Drosophila 
melanogaster 


CG8797 gene product 


118 


21 


364 


AAE01355 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:77.


2724 


97 


364 


gi12836042


Mus musculus 


putative 


2607 


93 


364 


AAE01380 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:102.


2500 


97 


365 


gi10439688


Homo sapiens 


cDNA: FLJ23109 fis, clone
LNG07754. 


2809 


99 


365 


gi9622093 


Mus musculus 


E-cadherin binding protein E7


2768 


97 


365 


AAG01765 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5846. 


737 


99 


366


gi12854995


Mus musculus 


putative 


844 


71 


366 


gi10241691


Homo sapiens 


Novel human gene mapping to 
chromosome 22.


791 


99 


366 


gi14602790


Homo sapiens 


DKFZP566F0546 protein, clone 
MGC:2444 IMAGE:2822570, mRNA, 
complete cds. 


791 


99 


367 


gi15082283


Homo sapiens 


Similar to small glutamine-rich
tetratricopeptide repeat (TPR)-
containing, clone MGC:10496
IMAGE:3625993, mRNA, complete
cds. 


720 


100 


367 


gi3377591 


Homo sapiens 


full length insert cDNA YN88E09.


592 


100 


367 


gi15488015


Homo sapiens 


TPR-containing co-chaperone mRNA, 
complete cds. 


450 


64 


368 


gi9104819 


Xylella 
fastidiosa 9a5c 


hypothetical protein 


151 


43 


368 


AAY59981 


Homo sapiens


Human endometrium tumour EST 
encoded protein 41. 


128 


46 


368 


AAE03351 


Homo sapiens 


Human gene 4 encoded secreted 
protein fragment, SEQ ID NO: 126. 


121 


58 


369 


gi5817053 


Homo sapiens 


mRNA; cDNA DKFZp586D0824 
(from clone DKFZp586D0824); partial 
cds. 


571 


43 


369 


gi15530285


Homo sapiens 


clone MGC:24275 IMAGE:3950542. 


571 


43 



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Accession No* 
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Score 


% 

Identity








mRNA, complete cds. 






369 


gi13569476




iTTimiiTiitv.acQnrifltpH niirlpAtiHp ^ 




49 


370 


gi8453103 


Homo sapiens 


zinc finger protein mRNA, complete
cds.


1296 


58 


370 


gi15012179




^ITIP fiTiOPr T^TrttPiH 1 n CtSC 0 1 r*l/\TiP 

MGC:15145 IMAGE:3949487, mRNA, 

complete cds.




JO 


370 


gi498721


Homo sapiens


zinc finger protein


1970 




371 


gi15929964


Homo sapiens


Similar to hypothetical protein

FLJ10702, clone MGC:21954
IMAGE:4391821, mRNA, complete
cds. 




inn 


371


AAB42336 


Homo sapiens 


Human ORFX ORF2100 polypeptide
sequence SEQ ID NO:4200. 


932 


93 


371 


AAB93080 


Homo sapiens 


Human protein sequence SEQ ID
NO: 11912. 


923 


91 


372 


gi7328451 


Mus musculus 


sialidase 


893 


44 


372 


AAB93971 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14038. 


866 


42 


372 


AAW73964 


Homo sapiens 


Human sialidase protein sequence. 


866 


42 


373 


gi1480005


Mus musculus 


Zic4 protein 


1490 


86 


373 


AAB14349


Homo sapiens 


Human Zic1 protein.


1102 


67 


373 


gi1208429


Homo sapiens 


mRNA for Zic protein, complete cds.


1102 


U / 


374 


gi12860114


Mus musculus 


putative 


876 


40 


374 


gi161958


Trypanosoma 
cruzi 


surface antigen


177 


23 


374 


gi1334643


laevis 


APT?^ nrppiire/ir nr/^tptn 


174 


ZO 


375 


AAY99349 


Homo sapiens 


Human PRO1110 (UNQ553) amino

acid refill ence ^FD TD MO*'^ 1 


1683 


100 


375 


AAB19729 


Homo sapiens 


Human SECX clone 4339264-2

pnmHpn nrnfpiTt 


1683 


100 


375 


AAB15549 


Homo sapiens 


Human immune system molecule from 
Incyte clone 2774913. 


1683 


100 


376 


gi12746394


Homo sapiens 


CUG-BP and ETR-3 like factor 4 

(CELF4) mRNA, complete cds.


936 


100 


376 


gi13278792


Homo sapiens 


Bruno (Drosophila) -like 4, RNA
binding protein, clone MGC:2693
IMAGE:2820541, mRNA, complete 
cds. 


911 


98 


376 


gi12804985


Homo sapiens 


Similar to etrl, clone MGC:4320 
IMAGE:2820541, mRNA, complete
cds. 


911 


98 


377 


gi12746394


Homo sapiens 


CUG-BP and ETR-3 like factor 4 
(CELF4) mRNA, complete cds. 


905 


89 


377 


gi13278792


Homo sapiens 


Bruno (Drosophila) -like 4, RNA 
binding protein, clone MGC:2693 
IMAGE:2820541, mRNA, complete
cds. 


880 


88 


377 


gi12804985


Homo sapiens 


Similar to etrl, clone MGC:4320
IMAGE:2820541, mRNA, complete
cds. 


880 


88 
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378 


gi12841060


Mus musculus 


putative 


oU9 


Id 


378 


gi7293285 


Drosophila
melanogaster 


CG4768 gene product 


239 


37 


378 


gi1938566


Caenorhabditis 
elegans 


Hypothetical protein C48B6.3


123 


38 


379 


gi3880385 


Caenorhabditis 
elegans 


predicted using Genefinder~contains
similarity to Pfam domain: PF01484
(Nematode cuticle collagen N-terminal
domain), Score=51.5, E-value=6.1e-12,
N=1~cDNA EST yk94a4.5 comes from
this gene~cDNA EST yk94a4.3 comes
from this gene~cDNA EST yk68d1.5
comes from this gene~cDNA EST
yk68d1.3 comes from this gene


79 


35 




glOOo*r 


Caenorhabditis
elegans 


unnamed protein product










l^o OTi n vti a /1 1 ^ o 

\^aCllUrilaDUlU.o 

elegans 


collagen


70 








Homo sapiens


Novel von Willebrand/thrombospondin-
like mature protein sequence. 


li^7 


04 






Homo sapiens


iNovei VQii w mcuiuiu/ uiromoosporiii* 

Itt^ finlvTiPnHH^ 


#;s7 


04 


380 


gi12836633


Mus musculus 


putative 


651 


59 


381 




Mus musculus


ilUUaUIIlai piuiciu x^jo. 


1Q1 




381 


gi57119 


Rattus 


ribosomal protein L35a (aa 1-110) 


191 


53 


381 


gi12846322


Mus musculus


putative 


191 


53 


382 




Mus musculus


putative




71 


382 


gi7293113 


Drosophila 


CG12379 gene product 


283 


72 


382 


gi6042159 


Caenorhabditis 


Hypothetical protein F53A3.7 


226 


55 


383 


AAB81053 




miuiiUi piuLCiu xi* viutv mi mm auu 




inn 


383 


gil2841896 


Mus musculus 


putative 


925 


98 


383 




Drosophila

melanogaster 




612 


OJ 


384 


gi10440373


Homo sapiens


cds. 


1345 


03 


384 


gi10440396


Homo sapiens 


mRNA for FLJ00031 protein, partial 
cds. 


647 


88 


384 


gi1086626


Caenorhabditis 
elegans 


Hypothetical protein C06A63 


273 


33 


385 


gi12053305


Homo sapiens 


mRNA; cDNA DKFZp434G099 (from 
clone DKFZp434G099); complete cds. 


1210 


100 


385


gi2516239


Mus musculus 


Rab33B 


1138 


94 


385 


gi12836564


Mus musculus 


putative 


1138 


94 


386 


gi7243247 


Homo sapiens 


mRNA for KIAA1433 protein, partial 
cds. 


3232 


100 


386 


AAB94053 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14222. 


3223 


99 


386 


gi13096872


Mus musculus 


Unknown (protein for MGC:7720) 


2906 


89 


387 


gi14599491


Homo sapiens 


small proline-rich protein 2F (SPRR2F)


458 


100 
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gene, complete cds. 






387 


gi14599489


Homo sapiens 


small proline-rich protein 2E 
(SPRR2E) gene, complete cds. 


444 


95 


387 


gi338423 


Homo sapiens 


Human small proline rich protein 
(sprII) mRNA, clone 930.


434 


94 


388 


gi6010699 


Rattus 
norvegicus 


F-box protein FBL2 


1449 


99 


388 


gi14043139


Homo sapiens 


RIKEN cDNA 2610511F20 gene,
clone MGC:15482 IMAGE:2987858,
mRNA, complete cds. 


1383 


100 


388 


gi12848653


Mus musculus


putative 


1371 


99 


389 


gi2853265 


Rattus 
norvegicus 


jun dimerization protein 2 


800 


96 


389 


gi12248392


Mus musculus 


transcriptional inhibitory factor


795 


95 


389 


gi6648146 


Homo sapiens


chromosome 14 clone CTD-2317F5 
map 14q24.3, complete sequence. 


481 


100 


390 


gi15277240


Homo sapiens 


genomic DNA, chromosome 6p21.3,
HLA Class I region, section 17/20. 


1296 


100 


390 


gi11875405


Homo sapiens 


HZFwl protein mRNA, complete cds. 


1291 


99 


390 


gi11875407


Homo sapiens 


HZFw2 protein mRNA, complete cds. 


773 


99 


391 


gi6572201 


Homo sapiens 


Human DNA sequence from clone 
CITF22-27C3 on chromosome 
22q13.1-13.31 Contains a gene for a
novel protein (dJ1163J1.2) and part of
a gene for a novel protein (dJ1163J1.3,
similar to mouse B99), ESTs, STSs and 
GSSs, complete sequence. 


863 


100 


391 


gi4469186 


Homo sapiens 


Human DNA sequence from clone 
RP5-1163J1 on chromosome 22q13.2-
13.33 Contains the 3' part of a gene for
a novel KIAA0279 LIKE EGF-like
domain containing protein (similar to
mouse Celsr1, rat MEGF2), a novel
gene for a protein similar to C. elegans
B0035.16 and bacterial tRNA (5-
Methylaminomethyl-2-thiouridylate)-
Methyltransferases, and the 3' part of a
novel gene for a protein similar to
mouse B99. Contains ESTs, GSSs and
putative CpG islands, complete
sequence. 


863 


100 


391 


AAB92551 


Homo sapiens 


Human protein sequence SEQ ID
NO:10735. 


862 


96 


392 


gi5001720 


Mus musculus 


odd-skipped related 1 protein 


1413 


97 


392 


gi15778246


Mus musculus 


odd-skipped related 2 


924 


66 


392 


gi15488723


Mus musculus 


Unknown (protein for MGC:19171) 


924 


66 


393 


AAB94364 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14895. 


2700 


99 


393 


gi10434650


Homo sapiens 


cDNA FLJ12895 fis, clone
NT2RP2004187, weakly similar to 
ZINC FINGER PROTEIN 38. 


2700 


99 


393 


gi13623217


Homo sapiens 


Similar to hypothetical protein 
FLJ12895, clone IMAGE:3533093, 


2150 


99 
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mRNA, partial cds. 






394 


gi12053105


Homo sapiens 


mRNA; cDNA DKFZp434K111 (from
clone DKFZp434K111); complete cds.


3116 


100 


394 


gi2282582 


Mus musculus 


actin-binding protein 


2402 


74 


394 


AAR94386 


Homo sapiens 


Human neural cell protein marker

RK/B. 


2400 


74 


395 


gi207145 


Rattus 
norvegicus 


synaptotagmin II 


2128 


95 


395 


gi7739733 


Mus musculus


synaptotagmin II


9101 




395 


gi688412 


Mus musculus 


synaptotagminII/IP4BP 


2121 


95 


396 




Homo sapiens


v/oDi'-rejaieo proiem i mtuNA, 
complete cds. 




OA 






Homo sapiens


Human protein sequence SEQ ID
NO: 10880. 




100 


396 




Homo sapiens


j^ipiu associaiea protem (jLijrAir j 
2764333CD1. 




100 


397 




fascicularis 


hypothetical protein




76 


397 


gi2447128 


Paramecium 

bursaria

Chlorella virus

1 


contains 10 ankyrin-like repeats; 
similar to human ankyrin, corresponds

to Swiss-Prot Accession Number

P16157 


212 


33 


397 


gi6634025 


Homo sapiens 


mRNA for KIAA0379 protein, partial 
cds. 


203 


38 


398 


AAB21047 


Homo sapiens


x-Lutuau uuuiciu ai/lUoUlIlUlIlg urUlClUy 

NuABP-51 




lUU 


398 


gi833629 


Xenopus 
laevis 


nucleoplasmin 


459 


49 


398 


gi64940 


Xenopus 
laevis 


nucleoplasmin (AA 1-200)


435 


HO 


399 


gi15919272


Homo sapiens 


putative forkhead/winged-helix 
transcription factor (FOXP2) mRNA,
complete cds. 


596 


84 


399 


gi2565057 


Homo sapiens 


CAGH44 mRNA, partial cds. 


596 


84 


399 


gi14582802


Mus musculus 


forkhead-related transcription factor 2 


588 


82 


400 


AAB08199 


Homo sapiens 


Amino acid sequence of human 
diacylglycerol kinase beta 
(DAGKbeta). 


4217 


99 


400 


gi10279722


Homo sapiens 


unnamed protein product 


4217 


99 


400 


gi485398 


Rattus 
norvegicus 


90kDa-diacylglycerol kinase


4046 


95 


401 


gi7670446 


Mus musculus 


unnamed protein product 


1295 


87 


401 


gi13185203


Homo sapiens 


unnamed protein product


799 


83 


401 


AAY31642 


Homo sapiens 


Human transport-associated protein-4 
(TRANP-4). 


466 


35 


402 


gi12837990


Mus musculus 


putative 


985 


69 


402 


gi5668737 


Mus musculus 


UBE.1C2 


661 


50 


402 


AAB94645 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15538. 


426 


52 


403 


gi10439821


Homo sapiens 


cDNA: FLJ23209 fis, clone
ADSH00512. 


2596 


99 


403 


gi10440353


Homo sapiens 


mRNA for FLJ00011 protein, partial


1448 


97 
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cds. 






403 


gi8217420 


Homo sapiens 




Human DNA sequence from clone 
RP11-108L7 on chromosome 10. 
contains part of the gene for a novel 
Insulin-like growth factor binding type 
protein with Kazal-type serine protease 
inhibitor domain, the gene for a novel 
protein similar to rat tricarboxylate 
carrier, the gene for a novel PDZ 
(DHR, GLGF) domain protein, the 
gene for a novel protein similar to 
KIAA0552, KIAA0341 and Fugu 
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46G11.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven
putative CpG islands, complete
sequence. 


1026 


100 


404 




Homo sapiens


Human ORFX ORF1983 polypeptide
sequence SEQ ID NO:3966. 


223U 


96 


404 


gi3417297




Human Chromosome 16 BAC clone
CIT987SK-A-635H12, complete




9o 


404 


gi15559282


Homo sapiens 


clone MGC:20208 IMAGE:3936339, 


1021 


53 


405 


gi13365905


Macaca 
fascicularis


hypothetical protein 


1154 


99 


405 


AAB15537 


Homo sapiens 


Human immune system molecule from 

Incvte clnne 97^1 190 


911 


100 


405 


AAE04891 


Homo sapiens 


Human transporter and ion channel-4


360 


39 


406 


gi262843 


Rattus sp. 


TIRllTntTSITlQTTIlHpT tmncnnri-PT 




OA 

yo 


406 


gi545078 


Rattus sp. 


W^ai^Vdr^Vdenendent tipnrntrancmiifpr 

transporter 






406 


AAR88390 


Homo sapiens 


Human neurotransmitter transporter 
protein. 






407 


AAB31212 


Homo sapiens 


Amino acid sequence of human 
polypeptide PRO6004. 


79R 


100 
1 vu 


407 


AAB44331 


Homo sapiens 


Human PRO4993 protein sequence
SEQ ID NO:612.


717 


100 


407 


gi4519558 


Rattus
norvegicus 


Kilon 


667 


94 


408 


gi15277972


Mus musculus


Similar to DnaJ (Hsp40) homolog, 
subfamily B, member 1 


808 


49 


408 


gi7804472 


Mus musculus 


heat shock protein 40 


808 


49 


408 


AAB72675 


Homo sapiens 


Human HDJ1.


804 


48 


409 


gi12841015


Mus musculus 


putative 


798 


52 


409 


AAB60114 


Homo sapiens


Human transport protein TPPT-34. 


787 


51 


409 


gi13435410


Mus musculus 


Similar to RIKEN cDNA 1810012H11 
gene 


768 


53 


410 


gi488555 


Homo sapiens 


Human zinc finger protein ZNF135 


1241 


52 
mRNA, complete cds.






410 


AAY73346 


Homo sapiens 


HTRM clone 619699 protein sequence. 


1238 


49 


410 


AAB43912 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1357. 


1231 


49 


411 


gi837292 


Rattus 
norvegicus 


S100A1 gene product


278 


59 


411 


AAB45531 


Homo sapiens 


Human S100A1 protein.


274 


57 


411 


gi11228039


Homo sapiens 


S100A1 cDNA


274 


57 


412 


AAB19851 


Homo sapiens 


Human muscle-specific protein Ozz. 


1504 


100 


412 


gi13929456


Homo sapiens 


Human DNA sequence from clone 
RP3-337018 on chromosome 20q12-13.1.
Contains the PLPT gene encoding
Phospholipid Transfer Protein, the
PPGB gene coding for Lysosomal
Protective Protein precursor (EC
3.4.16.5, Cathepsin A,
Carboxypeptidase C) and the gene
encoding peroxisomal acyl-CoA
thioesterase (PTE1, thioesterase II),
four novel genes, the gene for a novel
protein similar to Drosophila
Neuralized (NEU) and the 5' end of an
isoform of the TNNC2 gene for fast
troponin C. Contains three CpG
islands, ESTs, STSs and GSSs,
complete sequence


1504 


100 


412 


gi12835750


Mus musculus 


putative 


1328 


89 


413 


gi12847182


Mus musculus 


putative 


875 


87 


413 


gi4884173




(from clone DKFZn564fi0982V nartial 
cds. 


0*TW 


inn 


413 


gi10047333


Homo sapiens


mRNA for KIAA1628 protein, partial
cds. 


346 


42 


414 


gi7959343 


Homo sapiens 


mRNA for KIAA1538 protein, partial 
cds. 


3286 


100 


414 


AAB42721 


Homo sapiens 


Human ORFX ORF2485 polypeptide 
sequence SEQ ID NO:4970. 


382 


100 


414 


AAB42764 


Homo sapiens 


Human ORFX ORF2528 polypeptide 
sequence SEQ ID NO:5056. 


355 


41 


415 


gi14043332


Homo sapiens 


Similar to ring finger protein 23, clone 
MGC:2475 IMAGE:3051389, mRNA, 
complete cds.


1006 


43 


415 


gi10716078


Mus musculus 


testis-abundant finger protein


995 


42 


415 


gi10716076


Homo sapiens 


mRNA for testis-abundant finger 
protein, complete cds. 


966 


40 


416 


gi3599509 


Mus musculus 


rho/rac-interacting citron kinase 


1507 


61 


416 


gi3360512 


Rattus 
norvegicus 


Citron-K kinase 


1505 


89 


416 


gi3599507 


Mus musculus 


rho/rac-interacting citron kinase short 

isoform 


1503 


89 


417 


gi2358070 


Mus musculus 


trypsinogen 1 


898 


65 


417 


gi603903 


Gallus gallus 


trypsinogen 


408 


36 


417 


gi65163 


Xenopus 


trypsin precursor


405 


38 
Score


%

Identity






laevis 








HIO 




Rattus

norvegicus 


cerebroglycan


1 x^z 


Ol 


AlSi 




Homo sapiens


jtiuman risXj/uD {UN\ij5t)y} profem 
sequence SEQ ID NO: 109. 




AH 




AAx2oy\)y 


Homo sapiens 


Human OrCo protem. 


570 


4o 


419 


AAM06489 


Homo sapiens 


Human foetal protein, SEQ ID NO: 
220. 


376 


82 


it 1 o 

419 


gi12835376


Mus musculus 


putative 


230 


31 


419 


AAE02058


Homo sapiens 


Human four disulfide core domain
(FDCD)-containing protein.


222 


31 


420 


AAB42561 


Homo sapiens 


Human ORFX ORF2325 polypeptide 
sequence SEQ ID NO:4650. 


5075 


100 


420


gi5419865 


Homo sapiens 


mRNA; cDNA DKFZp434N074 (from 
clone DKFZp434N074). 


5070 


99 


420 


gi4589532 


Homo sapiens 


mRNA for KIAA0944 protein, partial 
cds. 


3375 


61 


421 


gi10438804


Homo sapiens 


cDNA: FLJ22419 fis, clone
HRC08593. 


1026 


60 


421 


gi13938187


Homo sapiens 


hypothetical protein FLJ22419, clone
MGC:14900 IMAGE:3347783, mRNA,
complete cds. 


1026 


60 


421 


gi6690339 


Mus musculus 


hematopoietic zinc finger protein 


717 


47 






Homo sapiens 


Human protein sequence SEQ ID 
NO:15739. 


lo7o 


99 






Homo sapiens


cDNA FLJ14090 fis, clone
PLACE2000111. 


iO/o 


99 


422 


gi5706454 


Homo sapiens 


mRNA for Natural killer cell p44 
related gene 2 (NKp44RG2). 


158 


29 


423 


gi15026974


Homo sapiens 


mRNA for obscurin (OBSCN gene). 


2713 


96 


423 


AAB95162 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17205. 


1173 


86 




gi13938170


Homo sapiens 
. 


clone IMAGE:2961284, mRNA,
partial cds. 


540 


26 




gllZoOX 304 


Mus musculus 


putative 


523 


51 


424 


AAE02058 


Homo sapiens 


Human four disulfide core domain 
(FDCD)-containing protein.


485 


38 


424 


gi12655452


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene). 


485 


40 


425 




com An c 

Homo sapiens


Human DNA sequence from clone 
RP11-550O8 on chromosome 20.
Contains a novel gene encoding a 
protein kinase, an RPL7 (60S 
Ribosomal Protein L7) pseudogene, a 
CpG island, ESTs, STSs and GSSs, 
complete sequence. 


zuoz 


oo 
99 


425 


AAB65688 


Homo sapiens 


Novel protein kinase. SEQ ID NO: 216. 


1732 


100 


425 


AAB65690 


Homo sapiens 


Novel protein kinase. SEQ ID NO: 218. 


1184 


69 


426 


gi388518 


Homo sapiens 


Human V beta 5.5 mRNA for a new T
cell receptor. 


627 


95 


426 


gi36173 


Homo sapiens 


H.sapiens rearranged T-cell receptor 
beta chain mRNA. 


613 


94 


426 


gi1552509


Homo sapiens 


Human germline T-cell receptor beta


606 


100 
chain TCRBV13S1, TCRBV6S8A2T,
TCRBV5S6A3N2T, TCRBV13S6A2T,
TCRBV6S9P, TCRBV5S3A2T,
TCRBV13S8P, TCRBV6S3A1N1T,
TCRBV5S2, TCRBV6S6A2T,
TCRBV5S7P, TCRBV13S4,
TCRBV6S2A1N1T, TCRBV5S4A2T,
TCRBV6S4A1, TCRBV23S1A2T,
TCRBV12S1A1N2, TCRBV21S2A2,
TCRBV8S1, TCRBV8S2A1T,
TCRBV8S3, TCRBV16S1A1N1,
TCRBV24S1A3T, TCRBV25S1A2PT,
TCRBV26S1P, TCRBV18S1,
TCRBV17S1A1T, TCRBV2S1,
TCRBV10S1P genes from bases
257519 to 472940 (section 2 of 3).






427 


AAE04752 


Homo sapiens 


Human beta-1,3-galactosyltransferase
homologue, ZNSSP8. 


434 


33 


427 


gi14597533


Homo sapiens 


unnamed protein product 


434 


33 


427 


gi14039836


Homo sapiens 


beta 1.3 N- 

acetyglucosaminyltransferase Lc3 
synthase mRNA, complete cds. 


434 


33 


428 


gi596142 


Homo sapiens 


Human proteasome subunit LMP7
(allele LMP7C) mRNA, complete cds. 


628 


49 


428 


gi38482 


Homo sapiens 


H.sapiens gene for major 
histocompatibility complex encoded
proteasome subunit LMP7.


624 


49 


428


gi1054747


Homo sapiens 


H.sapiens DMA, DMB, HLA-Z1, IPP2,
LMP2, TAP1, LMP7, TAP2, DOB,
DQB2 and RING8, 9, 13 and 14 genes.


624 


49 


490 


J\J\\ji i*tij 


Homo sapiens


Human olfactory receptor polypeptide. 




iUU 


429 


AAG71594 


Homo sapiens 


Human olfactory receptor polypeptide. 


1344 


83 


429 


AAG72476 


Homo sapiens 


Human OR-like polypeptide query 


1011 


100 


430 


gi10440063


Homo sapiens 


cDNA: FLJ23392 fis, clone HEP17418.


3045 


100 






Mus musculus


Unknown (protein for
IMAGE:4207025) 


070iC 


oU 


430 


gi1770528


Homo sapiens


H saniens iriH^ A fWr trflHQlin 
x±»oiXj^i^gio II irXii xvA " *"f^nii 

associated zinc finger protein-1. 






431 


gi12859929


Mus musculus 


putative 


917 


96 


431 


gi15207935


Macaca 
fascicularis 


hypothetical protein


301 


96 


431 


gi1655637


Mus musculus 


orf 


147 


27 


432 


gi4585414 


Bacteriophage 
933W 


hypothetical protein 


408 


42 


432 


gi4499798 


Bacteriophage 
933W 


orflS; homologous to ninG gene 


408 


42 


432 


gi5881629


Bacteriophage 
VT2-Sa 


hypothetical protein


408 


42 


433 


gi13161184


Homo sapiens 


cytochrome P450 2S1 (CYP2S1) 
mRNA, complete cds.


2615 


100 
433


AAB93056 




NO:11860, 


Z^Z / 


1 


433 


gi14042396


Homo sapiens


cDNA FLJ14699 fis, clone
NT2RP2006571, moderately similar to
CYTOCHROME P450 2G1 (EC
1.14.14.1).


z.^z / 




434 


gi13445575


Homo sapiens 


facilitative glucose transporter
GLUT10 (SLC2A10) mRNA, complete
cds.


97^9 
z/ 




434 


gi13603727


Homo sapiens 


glucose transporter (GLUT10) mRNA,
complete cds.


2752 


99 


434 


gi11065680


Homo sapiens 


Novel human gene mapping to
chromosome 20 similar to membrane
transporters.


2752 


99 


435 


gi13310486


Homo sapiens 


C2H2 zinc finpernrotein TSAm^ 
gene, complete cds. 






435 


gi6688241 


Homo sapiens 


SALL3 pene exon*; 1 a 0 anH 




00 
yy 


435 


gi1296845


Mus musculus


spalt protein






436 


AAG71445 


Homo sapiens 


Human olfactory receptor polypeptide,
SEQ ID NO:1126.


l'^ t9 

U IZ 


OJ 


436 


AAG71447 


Homo sapiens 


Human olfactory receptor polypeptide,
SEQ ID NO:1128.


924 


61 


436 


gi15293797


Homo sapiens 


clone OR6M1 olfactory receptor gene,
partial cds. 


829 


78 


437 


AAB65297 


Homo sapiens 


Human PRO9828 protein sequence
SEQ ID NO:511.


1360 


100 


437 


AAG89178 


Homo sapiens 


Human secreted protein, SEQ ID NO:
298. 


1360 


100 


437 


AAB84652 


Homo sapiens 


Amino acid sequence of fibroblast 
growth factor homologue zFGF12.


1360 


100 


438 


gi53756 


Mus musculus 


minopontin precursor (AA -66 to 272) 


1521 


100 


438 


gi297546 


Mus musculus 


osteopontin 


1516 


99 


438 


gi50864 


Mus musculus 


T lymphocyte activation protein 


1514 


99 
1


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00204 11.59 9.700e-12 426-437


1 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 3.667e-09 33-42 


2 


BL00291 


Prion protein. 


BL00291A 4.49 8.759e-09 185-220


3 


PF01105 


emp24/gp25L/p24 family. 


PF01105B 25.12 1.000e-40 178-230


4 


BL00307 


Legume lectins beta-chain proteins.


BL00307G9.91 8.531e-10 678-689 


4 


PF00922 


Vesiculovirus phosphoprotein.


PF00922A 19.17 8.862e-09 281-315 


6 


BL01159


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76


6 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323 


7 


BL01159 


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76


7 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323 


9 


BL00913 


Iron-containing alcohol dehydrogenases
proteins. 


BL00913D 24.20 8.981e-17 170-204
BL00913C 7.62 4.375e-11 136-146
BL00913B 10.94 7.706e-11 86-102


10 


BL00913 


Iron-containing alcohol dehydrogenases 
proteins. 


BL00913D 24.20 8.981e-17 218-252 
BL00913C 7.62 4.375e-11 184-194
BL00913B 10.94 7.706e-11 134-150


11 


BL50062 


BCL2-like apoptosis inhibitors (spans 
part of BH3, BH1 and BH.


BL50062C 6.66 8.500e-11 349-358


14 


BL01144 


Ribosomal protein L31e proteins.


BL01144 25.07 9.069e-26 78-130 


15 


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type
(and similar). 


PF00204 11.59 6.694e-10 355-366 


15 


BL00904 


Protein prenyltransferases alpha subunit
repeat proteins proteins. 


BL00904A 8.30 4.000e-09 485-535 


15 


BL00415 


Synapsins proteins.


BL00415N 4.29 6.727e-12 483-527 
BL00415N 4.29 2.774e-09 1 1 8-600
BL00415P 2.37 4.290e-09 819-855 
BL00415Q 2.23 6.534e-09 474-510 


15 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 4.500e-14 490-505
PR00049D 0.00 2.500e-12 489-504 
PR00049D 0.00 4.000e-12 491-506 
PR00049D 0.00 8.201e-11 488-503
PR00049D 0.00 1.205e-10 492-507 
PR00049D 0.00 3.746e-09 487-502 
PR00049D 0.00 5.271e-09 485-500 
PR00049D 0.00 6.644e-09 493-508 


15 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 9.022e-13 471-504 
DM00215 19.43 1.458e-09 483-516 
DM00215 19.43 2.678e-09 469-502
DM00215 19.43 5.424e-09 468-501 
DM00215 19.43 8.017e-09 470-503 
DM00215 19.43 9.085e-09 466-499
DM00215 19.43 9.237e-09 484-517 


15 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.308e-09 116-143 


15 


BL00048 


Protamine P1 proteins.


BL00048 6.39 5.263e-10 196-223
BL00048 6.39 3.363e-09 262-289
BL00048 6.39 9.112e-09 184-211


17 


PR00773 


GRPE PROTEIN SIGNATURE 


PR00773D 16.14 5.922e-09 215-235
23


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION. 


PD00930B 33.72 7.300e-26 600-203 
PD00930A 25.62 1.514e-16 497-523


23 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 4.000e-12 727-746 


23 


PF00182 


GTPase-activator protein for Rho-like 
GTPases 


PF00182B 14.20 7.333e-12 549-128


25 


BL00375 


UDP-glycosyltransferases proteins. 


BL00375F 16.99 7.061e-35 291-336 
BL00375C 18.27 2.615e-19 126-150 
BL00375D 14.56 9.000e-17 192-220 
BL00375B 21.22 8.627e-16 67-108 
BL00375G 13.01 4.577e-13 390-430


28 


BL01170 


Ribosomal protein L6e proteins. 


BL01170A 12.34 9.143e-40 139-175


28 


PD01457 


RIBOSOMAL PROTEIN 40S ZINC-
FINGER METAL. 


PD01457A 16.51 9.845e-09 67-112 


29 


BL00359 


Ribosomal protein L11 proteins.


BL00359B 23.07 4.231e-24 56-97 
BL00359C 22.18 6.148e-22 111-145 
BL00359A 20.66 4.000e-21 20-56 


29 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 1.000e-08 40-73


30 


PR00983 


CYSTEINYL-TRNA SYNTHETASE
SIGNATURE 


PR00983D 14.16 3.209e-23 270-292 
PR00983C 11.27 3.415e-21 239-258 
PR00983A 11.10 1.878e-12 75-87 


30 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 2.286e-09 314-325 


31 


PR00718 


PHOSPHOLIPASE D SIGNATURE 


PR00718E 8.61 1.000e-08 327-351


32 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 6.133e-10 49-58 


33 


PF00992 


Troponin. 


PF00992A 16.67 7.972e-10 10-45
PF00992A 16.67 5.145e-09 17-52
PF00992A 16.67 6.684e-09 56-91


34 


BL01019 


ADP-ribosylation factors family 
proteins. 


BL01019A 13.20 8.000e-11 68-108


34 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 4.938e-20 75-98
PR00449A 13.20 1.900e-15 34-56
PR00449E 13.50 6.870e-15 173-196 
PR00449B 14.34 1.360e-10 57-74 
PR00449D 10.79 5.364e-09 137-151 


37 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764F 16.89 7.783e-11 204-225


37 


DM01077 


SEX HORMONE-BINDING 
GLOBULIN. 


DM01077A 16.30 l.lo5e-l 043-90 


37 


BL00279 


Membrane attack complex components
/perforin proteins. 


BL00279E 37.11 9.163e-09 187-235 


38 


PR00832 


PAXILLIN SIGNATURE 


PR00832B 9.87 6.284e-10 768-792 


38 


PR00806 


VINCULIN SIGNATURE


PR00806A 6.63 9.260e-09 766-777 


38 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.661e-15 766-781 
PR00049D 0.00 3.250e-12 764-779
PR00049D 0.00 7.277e-11 765-780
PR00049D 0.00 8.786e-10 763-778 
PR00049D 0.00 9.390e-09 762-777 


40 


BL00226 


Intermediate filaments proteins.


BL00226D 19.10 3.172e-34 397-444
BL00226B 23.86 5.929e-23 230-278
BL00226C 13.23 4.808e-21 296-327
BL00226A 12.77 5.065e-13 129-144 
BL00226B 23.86 6.400e-10 181-229 


41 


BL00243 


Integrins beta chain cysteine-rich
domain proteins. 


BL00243I 31.77 2.014e-09 156-199
BL00243I 31.77 5.437e-09 159-202
BL00243I 31.77 5.690e-09 30-73


41 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 5.865e-09 184-199 


41 


BL00203


Vertebrate metallothioneins proteins. 


BL00203 13.94 3.670e-11 66-112
BL00203 13.94 4.659e-11 40-86
BL00203 13.94 7.429e-11 70-116
BL00203 13.94 9.505e-11 140-186
BL00203 13.94 2.723e-10 21-67
BL00203 13.94 2.723e-10 61-107
BL00203 13.94 3.147e-10 105-151
BL00203 13.94 4.064e-10 22-68
BL00203 13.94 5.213e-10 161-207
BL00203 13.94 6.457e-10 26-72
BL00203 13.94 7.032e-10 184-230
BL00203 13.94 7.223e-10 80-126
BL00203 13.94 9.043e-10 130-176
BL00203 13.94 1.735e-09 175-221
BL00203 13.94 3.020e-09 150-196
BL00203 13.94 3.204e-09 65-111
BL00203 13.94 3.296e-09 95-141
BL00203 13.94 3.663e-09 135-181
BL00203 13.94 5.041e-09 47-93
BL00203 13.94 5.041e-09 85-131
BL00203 13.94 5.500e-09 100-146
BL00203 13.94 5.867e-09 126-172
BL00203 13.94 5.959e-09 90-136
BL00203 13.94 6.694e-09 170-216
BL00203 13.94 6.878e-09 151-197
BL00203 13.94 6.969e-09 17-63
BL00203 13.94 7.337e-09 115-161
BL00203 13.94 7.429e-09 71-117
BL00203 13.94 7.704e-09 171-217
BL00203 13.94 8.531e-09 155-201
BL00203 13.94 8.714e-09 165-211
BL00203 13.94 9.265e-09 116-162


41 


BL00269 


Mammalian defensins proteins. 


BL00269C 16.52 9.289e-09 28-57
BL00269C 16.52 9.289e-09 72-101 


41 


PD02283 


PROTEIN SPORULATION REPEAT
PRECU. 


PD02283C 17.54 5.050e-09 138-166 
PD02283C 17.54 5.175e-09 24-52 
PD02283C 17.54 5.175e-09 68-96
PD02283C 17.54 6.738e-09 113-141
PD02283C 17.54 7.188e-09 163-191
PD02283C 17.54 7.750e-09 173-201 
PD02283C 17.54 7.975e-09 128-156 
PD02283C 17.54 8.650e-09 148-176 
PD02283C 17.54 9.325e-09 118-146 


41 


BL00799 


Granulins proteins. 


BL00799D 12.41 7.661e-09 49-96 
BL00799G 9.41 1.000e-08 39-80


43 


BL00291 


Prion protein. 


BL00291A 4.49 4.414e-09 47-82 


44 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 1549-1561 


44 


BL00142 


Neutral zinc metallopeptidases, zinc-
binding region proteins.


BL00142 8.38 2.286e-09 730-741




44 


PR00480 


ASTACIN FAMILY SIGNATURE 


PR00480B 15.41 3.314e-09 725-744 


45 


BL00414 


Profilin proteins. 


BL00414D 15.59 9.182e-10 81-108


48 


PR00837 


ALLERGEN V5/TPX-1 FAMILY
SIGNATURE


PR00837D 11.12 6.023e-09 22-36 


48 


BL01009 


Extracellular proteins SCP/Tpx- 
1/Ag5/PR-1/Sc7 proteins. 


BL01009E 13.50 8.204e-09 21-37 


49 


BL00284


Serpins proteins.


BL00284A 15.64 2.350e-20 85-109 
BL00284D 16.34 4.240e-19 323-350 
BL00284C 28.56 5.600e-17 216-258 
BL00284E 19.15 7.500e-14 408-433 
BL00284B 17.99 9.379e-13 189-210 


50 


BL01283 


T-box domain proteins. 


BL01283A 24.15 2.125e-39 148-196
BL01283B 23.17 9.438e-34 208-250 
BL01283D 11.70 7.868e-31 298-331 
BL01283C 13.05 8.448e-16 260-274 


50 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 9.182e-26 156-181 
PR00937D 13.41 7.375e-17 259-274 
PR00937B 14.58 8.615e-15 223-237
PR00937E 11.86 8.541e-14 301-315 
PK0u937r 12.^3 1.45Ue-lz j/z-JJl 
PR00937C 10.51 1.000e-11 240-250


50 


PR00938 


BRACHYURY PROTEIN FAMILY 
SIGNATURE 


PROOyioC oJZo O.D4 fC'-W 


50 


PR00427 


INTERLEUKIN-8 RECEPTOR 
SIGNATURE 


FKU0427A lO-iU o. Iloo-Utf 4lo-4:>l 


51 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270D 24.66 8.054e-09 50-86 


52 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.543e-13 181-221 


52 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 7.o»2e-ll 150-172 
PR00245C 7.84 5.286e-10 290-306 


52 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 3.700e4)9 195-21o 
PR00237G 19.63 8.535e^ 326-353 


53 


PR00050 


COLD SHOCK PROTEIN 
SIGNATURE 


PR00050A 11.28 3.143e-12 42-58 
PR00050C 9.82 9.151e-ll 85-104 


53 


BL00352 


'Cold-shock' DNA-binding domain 
proteins. 


BL00352B 23.66 2.881e-13 71-110
BL00352A 12.19 1.327e-10 42-57


So 


BL01173 


Lipolytic enzymes G-D-X-G family,
histidine. 


/II I^OT A AA'Jt* 17 14/1 1/S7 
Dl^vl 1 fji> iJ^Zf 1.*f0^e-l / IW-lO/ 

BL01173C 8.98 4.349e-14 182-196
BL01173A 9.41 1.818e-13 454-467
BL01173C 8.98 6.553e-13 495-509
BL01173A 9.41 8.364e-13 107-120


57 


PR00321 


GAMMA G-PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00321C 15.39 2.473e-12 123-141 


58 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 1.000e-24 117-142
PR00937D 13.41 5.500e-18 220-235
PR00937B 14.58 5.235e-13 184-198
PR00937F 12.53 1.450e-12 293-302
PR00937E 11.86 1.918e-12 259-273
PR00937C 10.51 3.133e-11 201-211
58


BL01283 


T-box domain proteins. 


BL01283A 24.15 1.000e-40 109-157
BL01283C 13.05 8.286e-17 221-235
rJJLUizojjj 1 1. /u D. /uye-n zoyouz 


58 


PR00938 


BRACHYURY PROTEIN FAMILY 
SIGNATURE 


PR00938C 8.28 7.384e-09 225-243


59 


PD02059 


CORE POLYPROTEIN PROTEIN
GAG CONTAINS: P.


PD02059A 28.10 2.694e-09 116-157 


63 


BL00196 


Ribosomal protein L30 proteins. 


BIjOUI9o 34.3e 3.250e-15 40-97 


64 


BL00226 


Intermediate filaments proteins.


BL00226B 23.86 1.205e-31 264-312 


64 


BL01305 


moaA / nifB / pqqE family proteins. 


BL01305B 10.95 8.875e-09 78-88 


68 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.727e-13 33-67 


69 


PR00874 


FUNGI-IV METALLOTHIONEIN
SIGNATURE 


PR00874C 4.37 7.214e-10 68-83 


69 


PD00866 


GLYCOPROTEIN PROTEIN SPIKE 
E2 PRECURSOR PEPLOMER. 


PD00866L 3.73 6.564e-10 1-11
PD00866L 3.73 1.443e-09 26-36


69 


BL00026 


Chitin recognition or binding domain 
proteins. 


BL00026 12.95 3.013e-09 48-69


69 


DM01724 


kw ALLERGEN POLLEN CIMl HOL- 
LL 


DM01724 8.14 3.250e-09 10-30 


69 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.838e-09 111-126 


69 


BL00243 


Integrins beta chain cysteine-rich
domain proteins. 


BL00243I 31.77 4.838e-10 106-149
BL00243I 31.77 7.221e-10 18-61
BL00243I 31.77 1.761e-09 41-84
BL00243I 31.77 3.408e-09 31-74
BL00243I 31.77 7.465e-09 71-114


69 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 4.107e-13 66-112
BL00203 13.94 2.138e-12 92-138
BL00203 13.94 1.099e-11 28-74
BL00203 13.94 3.176e-11 82-128
BL00203 13.94 3.374e-11 87-133
BL00203 13.94 5.846e-11 77-123
BL00203 13.94 7.231e-11 102-148
BL00203 13.94 1.670e-10 97-143
BL00203 13.94 2.532e-10 103-149
BL00203 13.94 5.021e-10 88-134
BL00203 13.94 7.128e-10 38-84
BL00203 13.94 7.168e-10 107-153
BL00203 13.94 7.702e-10 73-119
BL00203 13.94 9.426e-10 25-71
BL00203 13.94 1.918e-09 101-147
BL00203 13.94 2.745e-09 27-73
BL00203 13.94 4.031e-09 71-117
BL00203 13.94 4.857e-09 36-82
BL00203 13.94 5.041e-09 98-144
BL00203 13.94 5.154e-09 6-52
BL00203 13.94 6.418e-09 76-122
BL00203 13.94 7.980e-09 91-137
BL00203 13.94 8.255e-09 13-59
BL00203 13.94 8.898e-09 48-94


69 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 9.514e-09 80-94


73 


PR00875 


MOLLUSC METALLOTHIONEIN 
SIGNATURE 


PR00875A 5.83 9.679e-10 17-29 
74


PR00185 


HISTONE H4 SIGNATURE 


PR00185B 13.68 8.888e-09 364-384


86 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 7.000e-13 200-213


86 


BL00028 


Zinc finger, C2H2 type, domain
proteins. 


BL00028 16.07 1.900e-10 184-201
BL00028 16.07 6.100e-10 371-388
BL00028 16.07 6.914e-09 317-334


86 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


'DT>AAA>IO'D ^ AO "7 1 <»Q« AO lOT OAT 


87 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870D 15.74 8.468e-09 358-393 


88 


BL00048 


Protamine PI proteins. 


82 BL00048 6.39 5,500e-10 70-97 BL00046 
6.39 2.350e-09 62-89 BL00048 6.39 3.700e- 
09 60-87 BIJ00048 6.39 5.050e-09 63-90 

T%T f\f\/\AO ^ ^A f OOO^ An OO DT AAAilO 

BL00048 6.39 o.2o8e-09 6I-08 BL00u48 
6.39 9.438e-09 71-98 


89 


PR00320 


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE 


UOAA^OA/^ 11A1 Q OOAa 1 A OAO OlO 

PR00320B 12.19 9.486e-10 202-217 
PR00320A 16.74 8.902e-09 202-217 


nA 
90 




FKBP-type peptidyl-prolyl cis-trans
isomerase proteins. 


mLUi/H-j jJ3 zj.oo J.oo*^e-zo iuo-i*tu 
BL00453A 15.57 1.000e-15 81-96
BL00453C 9.72 1.000e-12 147-160


92 


PR00299 


ALPHA CRYSTALLIN SIGNATURE


PR00299B 17.53 7.211e-09 324-337 


93 


PF00676 


Dehydrogenase El component. 


PF00676D 14.40 4.857e-13 421-441 
PF00676C 16.88 1.931e-10 389-413 
PF00676B 24.71 5.433e-10 192-230 


96 


BL00824 


Elongation factor 1 beta/beta'/delta
chain proteins. 


BL00824B 9.21 3.919e-09 1472-1492 


99 


PR00417 


PROKARYOTIC DNA
TOPOISOMERASE I SIGNATURE 


PR00417A 12.66 5.415e-09 866-880 


102 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.936e-29 17-56 


102 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.059e-14 435-452
BL00028 16.07 7.353e-14 351-368
BL00028 16.07 2.350e-13 295-312
BL00028 16.07 9.100e-13 491-508
BL00028 16.07 2.174e-12 463-480
BL00028 16.07 8.826e-12 211-228
BL00028 16.07 2.038e-11 379-396
BL00028 16.07 2.385e-11 323-340
BL00028 16.07 3.423e-11 239-256
BL00028 16.07 9.654e-11 407-424
BL00028 16.07 1.000e-10 267-284


102 


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins. 


BL00479A 19.86 6.362e-09 366-389


102 


PD02462 


PROTEIN BOLA TRANSCRIPTION 
REGULATION AC. 


PD02462A 22.48 7.695e-09 204-239


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.000e-15 460-474
PR00048A 10.52 1.000e-14 432-446
PR00048A 10.52 3.250e-14 320-334
PR00048A 10.52 4.750e-14 348-362
PR00048A 10.52 6.250e-14 376-390
PR00048A 10.52 3.133e-13 292-306
PR00048A 10.52 1.529e-12 488-502
PR00048B 6.02 1.000e-11 336-346
PR00048B 6.02 9.308e-11 224-234
PR00048B 6.02 2.688e-10 476-486
PR00048B 6.02 3.250e-10 280-290
PR00048A 10.52 5.696e-10 404-418
PR00048A 10.52 6.087e-10 264-278
PR00048B 6.02 6.187e-10 420-430
PR00048A 10.52 7.214e-10 236-250
PR00048B 6.02 8.875e-10 364-374
PR00048B 6.02 3.368e-09 171-181
PR00048B 6.02 4.316e-09 448-458 


103 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.438e-37 10-49 


103 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.500e-13 413-430
BL00028 16.07 1.000e-12 273-290
BL00028 16.07 1.783e-12 357-374
BL00028 16.07 7.577e-11 301-318
BL00028 16.07 9.308e-11 441-458
BL00028 16.07 9.308e-11 469-486
BL00028 16.07 1.300e-10 329-346


103 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE


PR00048A 10.52 7.000e-14 354-368 
PR00048A 10.52 2.286e-13 298-312 
PR00048A 10.52 9.357e-13 270-284 
PR00048A 10.52 3.209e-12 410-424 
PR00048B 6.02 5.000e-12 286-296 
PR00048B 6.02 1.000e-11 342-352
PR00048B 6.02 1.000e-11 370-380
PR00048B 6.02 1.125e-10 314-324
PR00048A 10.52 2.565e-10 466-480 
PR00048A 10.52 4.522e-10 438-452 
PR00048B 6.02 1.474e-09 454-464 
PR00048A 10.52 3.520e-09 326-340 
PR00048B 6.02 4.789e-09 482-492 


103 


PD00066


PROTEIN ZINC-FINGER METAL- 


PD00066 13.92 8.200e-16 289-302 PD00066 

1 j.yz J. /o5^e-u ji/-jju i^uuuuoo 
6.538e-15 373-386 PD00066 13.92 2.800e^ 
14 345-358 PD00066 13.92 4.600e-14 457- 
470 PD00066 13.92 4.130e-ll 40M14 
PD00066 13.92 9.654e-10 429-442 PD00066 
13.92 5.200e-09 261-274 


103 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024H 13.88 7.353e-09 163-216 


104 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 325-369 


105 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 379-423 


107 


PR00939 


C2HC-TYPE ZINC-FINGER
SIGNATURE


PR00939B 13.27 3.209e-09 1302-1311




108 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.800e-14 279-292
PD00066 13.92 4.600e-14 307-320
PD00066 13.92 1.000e-13 335-348
PD00066 13.92 7.500e-13 363-376


108 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.882e-14 319-336
BL00028 16.07 7.300e-13 347-364
BL00028 16.07 4.913e-12 291-308
BL00028 16.07 2.500e-10 263-280
BL00028 16.07 1.257e-09 375-392


108 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 4.214e-13 288-302
PR00048B 6.02 5.000e-12 304-314
PR00048A 10.52 6.824e-12 372-386
PR00048A 10.52 7.353e-12 344-358
PR00048A 10.52 7.158e-11 316-330
PR00048B 6.02 7.231e-11 276-286
PR00048B 6.02 1.000e-09 332-342
PR00048B 6.02 6.211e-09 388-398


108 


BL00115


Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 


BL00115Z 3.12 8.842e-18 96-145
BL00115Z 3.12 7.144e-17 89-138
BL00115Z 3.12 6.888e-16 103-152
BL00115Z 3.12 7.791e-15 82-131
BL00115Z 3.12 3.947e-14 61-110
BL00115Z 3.12 7.292e-14 117-166
BL00115Z 3.12 9.164e-14 110-159
BL00115Z 3.12 1.000e-13 75-124
BL00115Z 3.12 3.871e-13 54-103
BL00115Z 3.12 6.819e-13 68-117
BL00115Z 3.12 4.168e-11 124-173
BL00115Z 3.12 9.651e-10 47-96
BL00115Z 3.12 7.485e-09 71-120
BL00115Z 3.12 9.669e-09 78-127


109 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.680e-33 391-420 
PR00193C 12.60 4.789e-32 156-184 
PR00193B 11.69 l.o92e-26 110-136 
rKUUiyjc. VpAi D.DUUe-^1 44D-474 
PR00193A 15.41 4.130e-20 54-74 
PR00193F 19 47 5 091e-12 444-47^ 


110 


BL00239 


Receptor tyrosine kinase class II
proteins. 


BL00239B 25.15 2.985e-16 67-115 


110


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B nn 8.660e-13 132-151 


110 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.462e-25 132-163 
BL00107B 13.31 6.143e-10 197-213


110


DM00406 


GLIADIN. 


DM00406 7.73 1.800e-09 818-831 


110


BL00904 


Protein prenyltransferases alpha subunit 

repeat proteins proteins. 


BL00904A 8.30 5.596e-09 815-865


110


BL00415 


Synapsins proteins. 


BL00415A 6.15 7.684e-09 796-837 


110


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.373e-09 801-834 
DM00215 19.43 7.712e-09 797-830 
110


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 4.188e-09 817-836 
PR00209C 4.56 8.929e-09 790-804 


111 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 2.800e-10 366-377
BL00678 9.67 5.263e-09 417-428
BL00678 9.67 6.211e-09 186-197


111 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308C 3.83 8.892e-10 108-118 
PR00308C 3.83 8.892e-10 109-119 
PR00308C 3.83 8.364e-09 107-117 


111 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320A 16.74 4.000e-13 364-379 
PR00320B 12.19 7.923e-12 415-430 
PR00320A 16.74 5.966e-11 415-430
PR00320C 13.01 7.214e-11 415-430
PR00320C 13.01 9.217e-11 364-379
PR00320A 16.74 9.690e-11 184-199
PR00320B 12.19 3.057e-10 184-199
PR00320C 13.01 6.040e-10 184-199 
rKuOizUo IzAy O.OD /e-lU Jo4o79 
PR00320B 12.19 1.450e-09 457-472

PR00320A 16.74 4.732e-09 457-472 

PR00320C 13.01 1.000e-08 281-296


112




SHADOW GLOBAL. 


DM00547C 17.30 7.000e-19 23-45 
DM00547E 13.94 5.154e-17 135-158
DM00547D 11.60 2.750e-13 105-119 


112 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 4.246e-10 1301-1329


112 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin). 


PF00426S 15.67 6.438e-10 1271-1309 


112 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039D 21.67 6.793e-10 368-414 


112 


PD02191 


I ATP-BINDING NUCLEOSIDE 
TRANSCR. 


PD02191A 13.55 9.036e-10 107-122


112 


BL00048


Protamine PI proteins. 


BL00048 6.39 1.900e-09 1257-1284
BL00048 6.39 5.050e-09 1258-1285




PF00774


Dihydropyridine sensitive L-type
calcium channel (Beta subuni. 


UfAATTyl A 1 ^ Al 1 OAa AO 1 OOA 1 OliT 

PF00774A 16.47 7.130e-09 1280-1326
PF00774A 16.47 7.730e-09 1276-1322 


112 




Eukaryotic RNA polymerase II
heptapeptide repeat proteins.


BT 1)01 1 S7 ^ 12 3 44Re-1 1 1 254-1 303 
BL00115Z 3.12 3.302e-10 1261-1310
BL00115Z 3.12 4.837e-10 1258-1307 
BL00115Z 3.12 7.767e-10 1251-1300 
BL00115Z 3.12 8.167e-10 1263-1312 
BL00115Z 3.12 8.884e-10 1260-1309 09 
1247-1296 BLOOl 15Z 3.12 2.985e4)9 1240- 
1289 BL00115Z 3.12 5.6326-09 1265-1314 
BL00115Z 3.12 8.676e-09 1253-1302 
BL00115Z 3.12 9.471e-09 1268-1317 
BIj00115Z 3.12 9.735e-09 1257-1306 


112 


PF00186 


Flocculin repeat proteins.


PF00186I 9.10 5.290e-13 1279-1309
PF00186I 9.10 6.838e-12 1277-1307
PF00186I 9.10 2.957e-11 1282-1312
PF00186I 9.10 7.496e-11 1276-1306
PF00186I 9.10 5.200e-10 1268-1298
PF00186I 9.10 7.450e-10 1278-1308
PF00186I 9.10 7.450e-10 1280-1310
PF00186I 9.10 4.543e-09 1266-1296
PF00186I 9.10 5.252e-09 1285-1315
PF00186I 9.10 6.031e-09 1272-1302
PF00186I 9.10 6.102e-09 1274-1304
rru01o61 9.10 7.23oe-09 1270-1300 

D17AA1 OiCf O 1 A O iVX £.t% AO 1 1^1 1 OOl 

PrUUlODl y.lU o.Uloe-Uy X2ol-127l 

PF00186I 9.10 9.433e-09 1267-1297
x^ruuiooi y.iu i.uuue-uo xzoo-izoo 


114 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 8.788e-11 237-256


1 1d 

I i*r 




Tyrosine specific protein phosphatases
proteins. 


j3L003o3£ 10.35 0.3276-10 240-251 


116 


JrlvVUOOH 


jsjLD\JO\JiyiI\i^ jTmSS^ lloO 




117 


PD02890 


T550MFRASF THAT jPONR— 

XOwlVXX^lVnOX^ I^XXTYXj^^^Vyi^Er^" 

FLA VONONE FLAV. 




118 


BL00226 


Intermediate filaments proteins.


BL00226B 23.86 6.5 13e-10 401^9 


118 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 1.925e-09 196-237


118 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.678e-09 328-382 

DlAJl lOUD o.y^Ze-Ui^ 034-/Uo 


119 


PD01823 


PROTEIN INTERGENIC REGION 
ABC1 PRECURSOR

MITOCHONDRION T. 


PD01823C 16.13 7.000e-14 352-373 
PD01823D 16.66 6.857e-10 430-451 


119 


pnoi lis 


SIGNAL. 


PFkAl 1 1 10 OO fi /101a no OAfi OfiO 
rXJUl i 1D£> 0.4Zie*-U7 Z0o-Zo2 


122 


BL00854


Proteasome B-type subunits proteins. 


BL00854C 29.92 8.435e-19 114-143


124 


BL00651 




RT nft#\^1 A 01 0^ ft ATT** lO OA 11A 


125 


BL01245 


proteins. 


RT m OA^F 1 ft 7^ 0 171*» 01 11A_171 

BL01245A 14.04 8.342e-23 206-231
BL01245C 13.31 6.564e-15 262-282
BL01245B 11.91 9.809e-10 245-255


128 


PR00793 


PROLYL AMINOPEPTIDASE (S33) 
FAMILY SIGNATURE 


PR00793C 12.24 1.333e-09 168-183 


128 


PR00111


ALPHA/BETA HYDROLASE FOLD 
SIGNATURE 


PR00111C 13.46 6.000e-09 182-196


129 


BL01160 


Kinesin light chain repeat proteins. 


BL01160D 10.17 7.077e-09 505-534 


129 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 1.000e-08 695-716


130 


BL00355 


HMG14 and HMG17 proteins. 


BL00355 5.97 8.412e-32 18-49


130 


PR00925


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE


PR00925B 3.73 3.400e-16 34-47
PR00925A 5.47 1.750e-15 18-33
PR00925C 5.57 9.824e-09 51-62


131 


PR00041 


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE


PR00041E 7.20 2.976e-13 305-326




131 


BL00036 


bZIP transcription factors basic domain
proteins. 


BL00036 9.02 4.103e-09 299-312 


132 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 1.750e-09 205-226 
PR00211B 0.86 8.750e-09 199-220 


132 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.529e-11 201-234
LIMUUZ13 ly.Hj i.owe-iu ly^-zzo 

L/iViUvZiJ i^.HJ HcVi^He-iU ZUZ-Z3^ 

DM00215 19.43 6.304e-10 207-240 
DM00215 10 4*^ 7 450p-10 IRO-^l'^ 
DM00215 19.43 8.393e-10 196-229
DM00215 19.43 8.714e-10 218-251
DM00215 19.43 6.034e-09 185-218
DM00215 19.43 6.034e-09 219-252
DM00215 19.43 6.492e-09 223-256
DM00215 19.43 7.254e-09 200-233 
DM00215 19.43 9.390e-09 189-222 
DM00215 19.43 9.695e-09 213-246 


133 


BL00455 


Putative AMP -binding domain proteins. 


BL00455 13.31 5.125e-11 293-309


133 


PR00154


AMP-BINDING SIGNATURE 


PR00154A 8.88 6.276e-09 286-298


136 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELL SI. 


PD00015A 8.90 6.400e-09 243-251 


138 


BL00227 


Tubulin subunits alpha, beta, and 
gamma proteins. 


BL00227B 19.29 1.000e-40 52-107
BL00227C 25.48 1.000e-40 113-165
BL00227A 24.55 8.200e-36 1-35 


140 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.377e-13 60-75
PR00049D 0.00 7.500e-10 63-78
PR00049D 0.00 8.071e-10 61-76


140 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4 J28 8.440e-09 68-82 


140 


BL00904


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 9.553e-09 60-110


141 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 6.438e-12 1175-1190 


141 


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins. 


BL01187B 12.04 5.800e-11 1284-1300
BL01187B 12.04 8.200e-11 180-196


141 


BL01248 


Laminin-type EGF-like (LE) domain 
proteins. 


BL01248 11.02 4.343e-12 1362-1375 
BL01248 11.02 2.350e-11 322-335
BL01248 11.02 4.125e-10 271-284


141 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764B 13.56 3.475e-09 1047-1068 


141 


PR00010


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 4.205e-09 185-196


141 


BL01113 


Clq domain proteins. 


BL01113A 17.99 5.673e-09 1621-1210 


141 


PR00011


TYPE III EGF-LIKE SIGNATURE


PR00011D 14.03 8.895e-12 551-132
PR00011B 13.08 5.846e-11 551-132
PR00011D 14.03 3.215e-10 313-332
PR00011A 14.06 4.214e-10 313-332
PR00011B 13.08 7.783e-10 313-332
PR00011A 14.06 7.781e-09 551-132


141 


BL00420 


Speract receptor repeat proteins domain 


BL00420A 20.42 8.200e-09 1186-1215 
141


PD02510 


ISOMERASE GALACTOSE-6- 


PD02510B 18.31 8.170e-09 548-144 


141 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261F 11.57 9.544e-09 1052-1074


141 


PR00288 


PUROTHIONIN SIGNATURE 


PR00288C 10.15 9.165e-09 311-326


142 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 4.750e-17 114-565 


142 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.373e-09 203-257 


142 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 4.000e-09 559-130 


142 


BL00422 


Granins proteins. 


BL00422E 26.86 8.6156-09 462-498 


143 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.846e-15 141-154
PD00066 13.92 9.217e-11 551-564
PD00066 13.92 6.700e-09 523-536


143 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.526e-11 122-136
PR00048A 10.52 2.174e-10 532-546 
PR00048A 10.52 6.087e-10 588-164 
PR00048B 6.02 7.632e-09 138-148 
PR00048A 10.52 8.920e-09 504-518 


143 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 8.920e-09 59-72 


143 


BL00028 


Zinc finger, C2H2 type, domain
proteins.


BL00028 16.07 7.577e-11 535-114
BL00028 16.07 2.200e-10 125-142
BL00028 16.07 5.800e-10 507-524
BL00028 16.07 8.714e-09 591-170
BL00028 16.07 9.743e-09 444-461


144 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 3.672e-10 262-285 


144


BL00215


Mitochondrial energy transfer proteins. 


BL00215A 15.82 7.900e-15 16-41
BL00215A 15.82 8.147e-14 260-285
BL00215A 15.82 1.804e-09 166-191
BL00215B 10.44 5.500e-09 114-127


144 




ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE


PR0U927B 14.00 8.o44e-09 104-126 


147


DM01417


O KW INLIUOirsO ArMCZ 

MUSHROOM SPAC22G7.04. 


UMU141 /C IZ.yi 3.2jOe-ll 2(fi-Zly 
DM01417D 11.08 2.200e-10 306-322


148 








151 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.807e-11 419-434
PR00049D 0.00 8.125e-11 1284-1299
PR00049D 0.00 3.929e-10 1283-1298
PR00049D 0.00 3.288e-09 417-432


151 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 3.553e-09 416-466


154 


BL00665


Dihydrodipicolinate synthetase 
proteins. 


BL00665D 14.76 1.000e-11 109-132
BL00665C 25.58 5.832e-11 50-101


154 


PR00146 


DIHYDRODIPICOLINATE 
SYNTHASE SIGNATURE 


PR00146D 16.26 2.525e-10 108-126 
PR00146A 12.62 8.615e-09 13-35 


156 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR. 


PD02906C 24.17 9.115e-15 171-206
PD02906B 15.35 4.886e-13 142-155
PD02906D 12.27 1.000e-09 239-249
PD02906A 10.84 8.333e-09 92-105 


157 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 2.286e-11 396-412
BL00107A 18.39 6.148e-11 332-363


157 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 4.938e-09 332-351 


160 


PF01008 


Initiation factor 2 subunit 


PF01008B 25.59 9.171e-36 366-409
PF01008A 20.14 8.676e-12 315-336
PF01008C 12.25 7.382e-10 449-469


161 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591D 8.33 6.167e-09 2099-2112 


163 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.120e-09 99-113 
PR00019B 11.36 7.840e-09 73-87 


164 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.000e-14 143-160 


164 


PR00187 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00187A 12.84 8.800e-12 139-159 


165 


PR00310 


ANTI-PROLIFERATIVE PROTEIN
BTG1 FAMILY SIGNATURE


PR00310B 10.59 4.000e-39 41-71
PR00310C 12.74 2.256e-33 71-101
PR00310D 9.10 9.820e-33 101-131
PR00310A 11.17 7.000e-27 16-41


165 


BL00960 


BTG1 family proteins.


BL00960B 24.47 1.000e-40 34-79
BL00960C 12.68 6.745e-21 98-120 
BL00960A 10.98 5.304e-12 14-26 


166 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.688e-21 124-174 


166 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 4.162e-10 96-133


166 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.520e-13 456-478
PR00171E 14.87 2.750e-09 479-492 


166 


PR00172 


GLUCOSE TRANSPORTER 
SIGNATURE 


PR00172D 9113 6.513e-09 456-480


167


BL00216


Sugar transport proteins.


BL00216B 27.64 5.198e-20 124-174

167 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 4.162e-10 96-133 


168 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.929e-32 59-98


168 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.385e-15 520-533
PD00066 13.92 2.800e-14 296-309
PD00066 13.92 5.200e-14 240-253
PD00066 13.92 5.200e-14 548-561
PD00066 13.92 9.400e-14 436-449
PD00066 13.92 1.000e-13 324-337
PD00066 13.92 6.143e-12 352-365
PD00066 13.92 6.885e-10 268-281


168 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 6.000e-12 237-247 
PR00048A 10.52 6.294e-12 333-347 
PR00048A 10.52 6.824e-12 361-375 
PR00048A 10.52 9.471e-12 249-263 
PR00048A 10.52 4.316e-11 119-133
PR00048A 10.52 4.789e-11 529-543
PR00048A 10.52 6.684e-11 445-459
PR00048A 10.52 8.141e-11 305-319
PR00048B 6.02 6.063e-10 321-331 











PR00048B 6.02 6.063e-10 517-527 
rKU(Xl4oA lU.DZ 7.2oie-lU 

PR00048B 6.02 7.750e-10 545-117 

Xi^c\r\r\AQn AO 1 AiAe^ AO OQi 

r KUUU4or> O.Uz 1 .4 /hC-W J.yj'jUj 

PR00048A 10.52 1.000e-08 417-431


1 /v 




SIGNATURE




170 


PD02331 


CYCLIN CELL CYCLE DIVISION 
PROTE. 


PD02331A 19.76 7.429e-15 93-140 
PD02331B 13.43 1.125e-09 174-207


170 


PR00833 


POLLEN ALLERGEN POA PI
SIGNATURE


PR00833H 2.30 5.269e-09 3-18 


171 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 4.706e-14 140-161 
PD00126A 22.53 6.824e-14 289-310


173 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign.


BL00741B 14^7 3.418e-ll 294-317 


173 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 5.154e-11 86-102


173


PR00497

NEUTROPHIL CYTOSOL FACTOR 
P40 SIGNATURE 


PR00497D 11.91 5.962e-10 91-113 


173 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 6.442e-09 277-328 


175 


BL01016 


Glycoprotease family proteins. 


BL01016C 22.84 5.292e-19 60-105 
BL01016H 13.71 6.157e-12 307-317 
BL01016E 14.88 3.182e-11 141-169
BL01016D 8.86 6.741e-09 118-131 


175 


PR00789 


O-SIALOGLYCOPROTEIN 
METALLO-PROTEASE FAMILY 

SIGNATURE


PR00789E 12.42 7.128e-14 141-163
PR00789C 16.11 2.707e-12 85-105
PR00789B 10.48 1.205e-09 64-85
PR00789D 6.17 7.151e-09 118-131


176 


PR00850 


GLYCOSYL HYDROLASE FAMILY 
59 SIGNATURE 


PR00850B 6.67 5.455e-09 148-173


178 


PR00259 


TRANSMEMBRANE FOUR FAMILY 


PR00259A 9.27 8.676e-20 17-41
PR00259C 10.40 4.750e-17 85-114
PR00259B 14.81 8.615e-12 58-85
PR00259D 13.50 2J28e-11 235-262


178 


BL00421


Transmembrane 4 family proteins. 


BL00421B 17.62 6.186e-17 64-103 
BL00421A 11.79 6.800e-12 13-32
BL00421E 20.97 1.514e-10 232-262


178 


PR00235 


HERPESVIRUS MAJOR CAPSID 
PROTEIN (MCP) SIGNATURE 


PR00235A 14.64 8.000e-09 87-111


179 


BL01052


Calponin family repeat proteins.


BL01052C 18.51 6.806e-40 87-127
BL01052A 16.12 7.180e-32 3-35
BL01052B 15.31 8.031e-26 52-78
BL01052D 10.26 1.000e-24 174-194


179 


PR00890 


SMOOTH MUSCLE PROTEIN 22- 
ALPHA (TRANSGELIN) 
SIGNATURE 


PR00890E 14.34 3.813e-21 135-155
PR00890A 8.61 9.775e-21 34-54
PR00890C 8.22 1.000e-1? 84-98
PR00890B 8.75 3.455e-17 62-78
PR00890F 12.92 4.064e-14 161-174
PR00890D 16.17 5.174e-13 118-128







PR00888


SMOOTH MUSCLE
PROTEIN/CALPONIN FAMILY
SIGNATURE


rKOUsoeri y.yv 5-154e-zU 1 Ij-iyi 
PR00888C 12.27 5.179e-18 52-68 
PR00888D 16.09 4.273e-17 88-105
PR00888A 11.87 2.350e-16 3-18 PR00888E 
ll.ol J.43^e-io 104- I/O rKOOooor /.44 
4.oz5e-14 rKOOcSooO Xz./j o./we- 

14 162-176
PR00888B 13.72 2.350e-12 22-36


179 


PR00889 


CALPONIN SIGNATURE 


PR00889E 12.18 2.726e-12 171-187 


180


BL00875


Bacterial type II secretion system
protein D proteins. 


BL00875A 25.57 6.447e-09 367-399 


181


PD01351


PROTEIN REPEAT 
NEUROFILAMENT TRIPL. 


PD01351B 13.72 5.355e-09 238-264


182


DM01354 


Kw TRANSCRIPTASE REVERSE II 
0RF2. 


DM01354H 18.00 8.826e-27 109-149 
DM01354G 11.57 2.143e-25 78-109 
DM01354F 14.56 1.414e-15 42-78
DM01354E 18.69 8.650e-14 17-47 






Renal dipeptidase proteins. 


BL00869D 14.02 3.477e-09 67-96


185 


BL00039 


DEAD-box subfamily ATP-dependent
helicases proteins. 


BL00039A 18.44 4.000e-25 222-261
BL00039D 21.67 4.529e-23 498-544 
BL00039C 15.63 4.300e-16 347-371 

or AAAIOD 10 1 A O OTAa 1 C liC) OOD 

oLUUUjyU ly,iy y.37ye-lj 2o/-2oo 


185 


PD00302 


PROTEASE POLYPROTEIN 


PD00302B 9.52 1.346e-09 234-250 


186 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 5.714e-12 152-165 PD00066 

1 ^ 07 1 yl7o 17 1 OA 1 

O.i'lJe-1/ 1/4-1 J/ 


186 


BL00028 


Zinc finger, C2H2 type, domain 

proteins. 


BL00028 16.07 6.885e-11 136-153
BL00028 16.07 2.200e-10 197-214






MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


rKUOzjyil l.^o D. /03e-0y 4z0-43z 


186


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


ppnnriiiQA ia <o o oc7<> ia 1^1 i>i*7 
rKUUWoA lU.JZ z.y3/e-lU 133-147 

PR00048A 10.52 3.739e-10 194-208 
pPAon^fiA inc7Rn>i'5o ini/?i i7< 

l'JvUUU*loA lU.OZ O.U4je-lU lOl-l/D 

PR00048B 6.02 8.105e-09 121-131 




RT/^in77 


PTR2 family proton/oligopeptide
symporters proteins. 


HT ni n77"n 77 i€\ a o^Aa 1 a ^ac ika 


187 




INHIBIN ALPHA CHAIN

SIGNATURE 




190 


PR00830


ENDOPEPTIDASE LA (LON) 
SERINE PROTEASE (S16) 
SIGNATURE 


PR00830A 8.41 3 J42e-09 881-901 


191 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.234e-13 261-280 


191 


BL00107 


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 1.000e-23 261-292
BL00107B 13.31 1.000e-12 341-357


191 


BL00239 


Receptor tyrosine kinase class II
proteins. 


BL00239B 25.15 6.523e-10 196-244 


191 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins 


BL00479C 12.01 1.000e-09 320-333


191 


PR00834 


HTRA/DEGQ PROTEASE FAMILY 


PR00834F 10.91 2.946e-09 786-799 









SIGNATURE 




193 


BL01033 


Globins profile. 


BL01033A 16.94 2.385e-18 25-47 






SIGNATURE 


PROORMA 1? QA 1 000^-92 *^ft-A7 
PR00814B 9.18 7.750e-18 48-64 




PR AO 7 7^ 

XlWJUl iJ 












PP 00*^90 A 16 7A A Q71p-in lAO-lSS 
PR00320C 13.01 9.280e-10 140-155 


194






RT 0067R Q 67 7 6'^7f-0Q 149-1 


196 


PR00832 


PAXILLIN SIGNATURE


PR00832B 9.87 9.174e-10 309-333 


196 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.054e-10 376-430 
BL01160B 19.54 6.919e-10 383-437 


196 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.780e-09 40-55 


196 


BL00087 


Copper/Zinc superoxide dismutase 
proteins. 


BL00087C 20.18 8.784e-09 260-296 




PR00806


VINCULIN SIGNATURE


PR00806A 6.63 9.014e-09 308-319






Tropomyosins proteins. 


"DT AAOliC A 1 /I A1 A 1 vI'Sa AA CAjC CilA 

BL00326A 14.01 9.143e-09 506-540


197 


PR00674 


LIGHT HARVESTING PROTEIN B
CHAIN SIGNATURE


PR00674A 20.10 7.391e-09 134-155 






F-ACTIN CAPPING PROTEIN BETA
SUBUNIT SIGNATURE 


rKUOlVzC o.o5 z.5UUe-3o 57-o4 i'RuOiyzU 
8.23 4.462e-36 97-125 PR00192E 8.85 

7 nOOi» 'W 919 9*^0 PPAA1Q9 A fi 9^ 1 A7Aa 
97 96 PP00109R ^ 90 OHOp* 96 96^ft 






F-actin capping protein beta subunit
proteins.


RT/)09'? 1 A R SO 1 OOOp^O S-S1 RT 009^1R 

14.16 1.000e-40 84-128
BL00231D 15.40 1.000e-40 165-200
BL00231E 11.66 1.000e-40 209-246
BL00231C 12.77 1.180e-15 146-157


199 


PF00023 


Ank repeat proteins.


PF00023A 16.03 4.750e-10 45-61


199 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors


PF00791B 28.49 8.768e-12 87-142
PF00791B 28.49 7.028e-09 499-116


199 


BL01160 


Kinesin light chain repeat proteins. 


BL01160E 8.74 7.398e-09 323-362 


201 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239F 1.58 6.114e-09 183-195


202 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 4.033e-10 319-370 


202 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 4.845e-09 313-366 


202 


PF00992 


Troponin. 


PF00992A 16.67 8.734e-12 333-368 
PF00992A 16.67 2.776e-09 344-379 
PF00992A 16.67 5.026e-09 351-386 


203 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73


204 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73


205 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73


207 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 3.077e-17 573-167 
BL00211B 13.37 7.577e-17 1204-1674 











BL00211A 12.23 1.900e-09 472-484




JrJ\.uUH /o 


SIGNATURE 


D1?AA/l'7fiA n AA A '111aJ\Q AHA-AQO 






SERUM ALBUMIN FAMILY
SIGNATURE 


ODAADAI/l ^A Cn n 1 OQa AO 0*71 AAil 

PKUUoU2Cj 14.57 /.looe-UV S?71-yy4 




PR00836


SOMATOTROPIN HORMONE
FAMILY SIGNATURE


PR00836D 13.05 7.125e-09 1504-1519




UU AAA/1 Q 


SIGNATURE 


DDAAA/IOT^ A AA 1 'TOjCb 1 A OOO OAO 

rKUUlwyu U.OU 1.7oDe-lU2ooo03 


210 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.


BL00972D 22.55 3.348e-11 388-413
BL00972E 20.72 4.343e-09 415-437


210 


PR00198 


ANNEXIN TYPE II SIGNATURE


PR00198H 12.05 7.750e-09 682-696 


214 


PD00469 


PROTEIN PRECURSOR SIGNAL 
HYDROLA. 


PD00469A 13.95 6.400e-09 73-86 


215 


PF00023 


Ank repeat proteins.


PF00023A 16.03 8.875e-10 839-855
PF00023A 16.03 2.286e-09 884-900 


215 


PR00342 


RHESUS BLOOD GROUP PROTEIN 
SIGNATURE 


PR00342H 7.61 9.703e-09 317-340 


217 


BL00982 


Bacterial-type phytoene dehydrogenase
proteins. 


BL00982A 18.41 8.013e-12 328-360 


217 


PR00368 


FAD-DEPENDENT PYRIDINE 
NUCLEOTIDE REDUCTASE 

SIGNATURE


PR00368C 15.74 8.962e-11 326-352


217 


PR00469 


PYRIDINE NUCLEOTIDE 
DISULPHIDE REDUCTASE CLASS-
II SIGNATURE


PR00469I 13.83 7.532e-11 449-468
PR00469F 16.51 7.152e-09 322-347






IRON-SULFUR ELECTRON
TRANSPORT AROMATIC
HYDROCARB.


PD0Z042B 16.75 S.o73e-09 126-141 
PD02042A 21.13 9.045e-09 93-120 


217 


PR00419 


ADRENODOXIN REDUCTASE
FAMILY SIGNATURE


PR00419A 14.89 9.486e-09 326-349 


218 


PF00157 


PDZ domain proteins (Also known as 
DHR or GLGF).


PF00157 13.40 4.600e-09 688-699 


219 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 7.000e-23 65-96 
BL00107B 13.31 4.214e-10 130-146


219 


PR00109 


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 7.102e-10 65-84






Receptor tyrosine kinase class III
proteins. 


nr nno^np 11 < o^Oo-riQ <i bo 


220 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.045e-09 38-50 


220 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN 
H. 


DM01803A 10.51 9.349e-09 34-55 


220 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.160e-11 40-55
PR00049D 0.00 7.807e-11 41-56
PR00049D 0.00 8.336e-11 38-53
PR00049D 0.00 2.286e-10 42-57
PR00049D 0.00 8.857e-10 33-48
PR00049D 0.00 2.983e-09 37-52
PR00049D 0.00 9.847e-09 43-58


222 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 5.337e-10 825-859 





222 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.924e-09 516-132 




•QT nfM7R 


LulVfX OUuMalU. pXVilCJlla. 




226 


BL00048 


Protamine PI proteins. 


BL00048 6.39 6.063e-09 199-226




BL00115


Eukaryotic RNA polymerase II
heptapeptide repeat proteins.




228 


BL00415 


Synapsins proteins. 


BL00415Q 2.23 8.723e-09 253-289 






Glucosamine/ galactosamine-6- 
phosphate isomerases proteins. 


DT m K1 A 1 Q A7 1 r\f\(\a^f\ 17 77 

BL01161D 28.14 1.000e-40 199-244

tjr ni 71 17 ^ AOlo IQ 117 1 Aft 
IjJLAil iOi^ 10.4/ i.JUUC"Z«3 1/U*l72r 


231




PLEIOTROPHIN/MIDKINE FAMILY
SIGNATURE 


PPftn7/;OA 11 01 1 tll«k in 1 11 


231 


BL00181


PTN/MK heparin-binding protein
family proteins.


BL00181A 19.07 4.960e-37 76-112

rSlAJUloiA ly.U/ y./Z4e-lo /0-114 


236 


BL00888 


Cyclic nucleotide-binding domain 
proteins. 


BL00888B 14.79 9.069e-13 499-523


236 


BL00415 


Synapsins proteins. 


BL00415N 4.29 2.774e-09 733-777 


236 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSORRE. 


PD00306A 10.26 3.133e-09 646-660


236 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE


PR00209B 4.88 3.813e-09 739-758


236 


DM00668 


ZEIN. 


DM00668A 10.20 8.500e-09 258-273 


238 


BL01188 


GNS1/SUR4 family proteins. 


BL01188B 13.46 4.115e-26 120-151






BL01188C 22.65 4.136e-26 151-202 
BL01188D 8.62 1.290e-11 238-255
BL01188A 18.82 6.718e-10 55-87 


239 


PR00929 


AT-HOOK-LIKE DOMAIN

SIGNATURE 


PR00929B 4.38 8.875e^ 133-583 
PR00929C 5.26 8.914e-09 133-144 


242 


BL00232


Cadherins extracellular repeat proteins
domain proteins. 


BL00232B 32.79 2.765e-25 541-151 
BL00232B 32.79 8.263e-22 766-814 
BL00232B 32.79 2.397e-21 67-115 
BL00232B 32.79 4.133e-19 1481-1529 
BL00232B 32.79 1.000e-18 1371-1419
BL00232B 32.79 2.662e-18 1691-1739
BL00232B 32.79 5.292e-18 1287-1335 
BL00232B 32.79 9.147e-18 1148-1196 
BL00232B 32.79 1.265e-17 980-1028 

RT Aft717R 17 70 1 •^7Q<» 1 7 AlfL^IA 

BL00232B 32.79 2.588e-17 1084-1132
BL00232B 32.79 1.386e-16 1184-1232
BL00232C 10.65 5.390e-12 1369-1387
BL00232C 10.65 1.391e-11 204-660
BL00232C 10.65 2.174e-11 1584-1164
BL00232C 10.65 4.522e-11 1689-1707
BL00232C 10.65 1.000e-10 65-83
BL00232C 10.65 4.115e-10 1285-1303 
BL00232B 32.79 7.200e-10 649-697 
BL00232C 10.65 9.827e-10 978-996 
BL00232C 10.65 1.947e-09 170-188 
BL00232B 32.79 2.137e-09 172-220











BL00232C 10.65 4.474e-09 1182-1200 
BL00232C 10.65 8.737e-09 539-119


243 


BL00795


Involucrin proteins.


BL00795C 17.06 4.977e-10 64-109
BL00795C 17.06 6.300e-09 55-100


244 


BL00790 


Receptor tyrosine kinase class V 
proteins.


BL00790I 20.01 7.823e-15 23-54
BL00790I 20.01 9.400e-11 310-341
BL00790I 20.01 1.900e-10 117-148
BL00790I 20.01 3.893e-09 215-246


244 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE


PR00014D 12.04 6.400e-11 30-45
PR00014C 15.44 9.171e-09 204-223 


245 




Ubiquitin-conjugating enzymes

proteins. 




246 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 8.800e-12 205-219 

PP 0001 OR 11 ^li 9 000p.11 909.91^ 


247 


BL00214


Cytosolic fatty-acid binding proteins.


BL00214B 26.51 7.180e-24 206-251
BL00214A 21.17 6.250e-22 165-191


247 


PR00178 


FATTY ACID-BINDING PROTEIN 

SIGNATURE


PR00178A 15.07 4.913e-21 166-187

PP0017JIP 90 ^ 9 SOOp-1 7 99f^-9^4 

PR00178D 13.52 6.897e-16 272-291
PR00178B 10.52 4.900e-10 200-212






SIGNATURE 


PPOO'^O^P 16 17 9 047^.13 4fwM 


248 


BL00962 


Ribosomal protein S2 proteins. 


BL00962C 15.90 2.846e-12 46-64 






Tubulin subunits alpha, beta, and
gamma proteins.


BL00227D 18.46 1.000e-40 74-128
BL00227F 21.16 1.529e-33 226-280
BL00227E 24.15 1.409e-26 178-213






Tubulin subunits alpha, beta, and
gamma proteins.


BL00227C 25.48 1.000e-40 39-91
BL00227D 18.46 1.000e-40 148-202
BL00227F 21.16 1.529e-33 300-354 
BL00227E 24.15 1.409e-26 252-287 


251 


BL00152 


ATP synthase alpha and beta subunits 
proteins. 


BL00152B 21.40 1.900e-31 191-229 
BL00152A 15.38 5.154e-21 134-160 
BL00152C 11.41 6.250e-12 291-303


252 


BL00152 


ATP synthase alpha and beta subunits
proteins.


BL00152E 22.68 1.000e-32 285-323
BL00152A 15.38 5.154e-21 134-160 
BL00152C 11.41 6.250e-12 247-259 


253 


BL00518


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 2.200e-11 54-63


253 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 


254 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 9.739e-12 417-451 


254 


PR00417 


PROKARYOTIC DNA
TOPOISOMERASE I SIGNATURE 


PR00417A 12.66 8.472e-09 65-79 


255 


BL01052


Calponin family repeat proteins.


BL01052C 18.51 1.000e-40 88-128
BL01052A 16.12 2.875e-35 3-35
BL01052B 15.31 5.219e-26 52-78


255 


PR00888 


SMOOTH MUSCLE 
PROTEIN/CALPONIN FAMILY 
SIGNATURE 


PR00888D 16.09 9.112e-19 89-106 
PR00888E 11.81 2.800e-18 105-121 
PR00888F 7.44 4.600e-18 126-141 








PR00888A 11.87 7.750e-18 3-18
PR00888C 12.27 2.28oe-17 52-68
PR00888G 12.73 9.438e-15 163-177
PR00888B 13.72 1.321e-14 22-36


255 


PR00890 


SMOOTH MUSCLE PROTEIN 22-
ALPHA (TRANSGELIN)
SIGNATURE


PR00890E 14.34 1.429e-27 136-156
PR00890A 8.61 1.000e-26 34-54
PR00890C 8.22 1.600e-19 85-99
PR00890B 8.75 6.318e-19 62-78
PR00890F 12.92 1.205e-17 162-175
PR00890D 16.17 1.130e-13 119-129


257 


BL00745


Prokaryotic-type class I peptide chain
release factors signat


BL00745C 13.66 1.000e-40 202-249
BL00745B 22.56 8.683e-33 148-191
BL00745D 14.90 8.435e-23 280-303


259 


BL00194 


Thioredoxin family proteins. 


BL00194 12.16 7.429e-10 684-697 


260 


BL00612


Osteonectin domain proteins. 


BL00612E 13.12 3.948e-10 39M36 


260 


BL00484 


Thyroglobulin type-1 repeat proteins
proteins.


BL00484C 17.01 8.244e-11 136-151
BL00484B 9.04 2.145e-10 249-263 
BL004o4U 17.U1 2.i0ye-U9 2o9-2o4 
BL00484B 9.04 8.950e-09 116-130 




PR00187


DNAJ PROTEIN FAMILY

SIGNATURE 


'DT>AA10'7A 1*) 0>t O OTCa AA OOO OAO 

rKUUIo / A 12.o4 /.J/^e-UV 2oo-3U5 


262 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.681e-09 292-309


262




Aminotransferases class-V pyridoxal- 
phosphate attachment site proteins. 


DT AAl C7 All O '^AAm. AA 1 ^ 1^ 

BL0ul57A 11.72 8.200C-09 16-26 






SIGNATURE 


'DPAAaOAH 11 10 O HCa AO OAI m 

rKUUizUB 12. ly Z.XZ^e-Uy 2U7-222 






Trypanosome variant surface
glycoprotein. 


PITAAAl ^ A ^1 1 CAAa AA iCiCii; iCT^ 

rrUWlJA /*5d Z.DUUe-W 000-07J 


266 


BL01144


Ribosomal protein L31e proteins.


BL01144 25.07 1.000e-40 21-73


zoo 




loo UlbCUlUlN 1 M-lbKMlNAL. 


UMUU31D iU.5i o.looe-13 133-190 


268 


BL00132


Zinc carboxypeptidases, zinc-binding 
region 1 proteins. 


BL00132C 21.35 7.863e-10 307-348 
BLOUl JZA Zo.07 B.yoJse-lO 224-265 


268 


PR00765 


CARBOXYPEPTIDASE A 
METALLOPROTEASE (M14) 
FAMILY SIGNATURE 


PR00765B 15.57 7.171e-12 276-291
PR00765D 14.16 1.551e-09 420-434


268


BL00170


Cyclophilin-type peptidyl-prolyl cis- 
trans isomerase signatur. 


BL00170A 17.08 9.018e-09 485-512


269 


BL00622


Bacterial regulatory proteins, luxR
family proteins.




270 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 1.000e-11 447-461
PR00048A 10.52 4.316e-11 389-403
PR00048A 10.52 6.684e-11 362-376


270 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 3.143e-10 37-50 


270 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.000e-10 392-409
BL00028 16.07 9.100e-10 256-273
BL00028 16.07 2.286e-09 450-467
BL00028 16.07 8.714e-09 365-382


274 


DM00303 


6 LEA 1 UMER REPEAT REPEAT. 


DM00303A 13.20 3.310e-09 467-517


275 


PF00622 


Domain in SPla and the RYanodine 


PF00622B 21.00 9.357e-14 374-396 









Receptor. 


PF00622C 12.62 1.857e-12 458-472


275 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 8.800e-ll 44-53 


217 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 9.133e-10 65-78


278 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.200e-16 295-308
PD00066 13.92 8.200e-16 519-532
PD00066 13.92 1.692e-15 351-364
PD00066 13.92 4.462e-15 547-122
PD00066 13.92 4.600e-14 323-336
PD00066 13.92 4.600e-14 435-448
PD00066 13.92 7.000e-14 463-476
PD00066 13.92 1.500e-13 239-252
PD00066 13.92 3.143e-12 267-280
PD00066 13.92 3.143e-12 407-420
PD00066 13.92 8.826e-11 211-224
PD00066 13.92 2.038e-10 491-504
PD00066 13.92 2.385e-10 379-392


278 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-16 444-458
PR00048A 10.52 6.727e-15 360-374
PR00048A 10.52 9.182e-15 528-542
PR00048A 10.52 7.000e-14 472-486
PR00048A 10.52 7.750e-14 388-402
PR00048A 10.52 1.000e-13 332-346
PR00048A 10.52 3.133e-13 304-318
PR00048A 10.52 4.857e-13 118-132
PR00048A 10.52 6.786e-13 500-514
PR00048B 6.02 1.000e-12 292-302
PR00048A 10.52 8.941e-12 192-206
PR00048B 6.02 1.000e-11 348-358
PR00048A 10.52 1.947e-11 248-262
PR00048B 6.02 2.385e-11 264-274
PR00048B 6.02 7.231e-11 544-116
PR00048A 10.52 7.632e-11 416-430
PR00048B 6.02 8.615e-11 236-246
PR00048B 6.02 2.688e-10 516-526
PR00048B 6.02 4.375e-10 460-470
PR00048B 6.02 4.375e-10 488-498
PR00048B 6.02 4.938e-10 404-414
PR00048B 6.02 6.063e-10 320-330
PR00048A 10.52 7.214e-10 220-234
PR00048B 6.02 1.947e-09 432-442
PR00048B 6.02 4.316e-09 572-144


278 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 5.012e-09 191-204 


279 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 6.400e-16 449-462
PD00066 13.92 6.538e-15 504-517
PD00066 13.92 9.308e-15 421-434
PD00066 13.92 7.000e-14 476-489
PD00066 13.92 6.087e-11 393-406


279 


BL00028 


Zinc finger, C2H2 tfpt^ domain 
proteins. 


BL00028 16.07 2.500e-17 350-367 BL00028
16.07 5.050e-13 405-422 BL00028 16.07
9.171e-12 433-450 BL00028 16.07 2.731e-
11 488-505 BL00028 16.07 3.077e-11 516-533
i>jLAJuu/o lo.u/ o.iuue-iu J / 


279 


PD02462 


PROTEIN ROT A TR A>J9niTPTim>J 
REGULATION AC. 


rJJUz40zA zZ.4o 0.4ooe-Uy 4ol--)lo 


279 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 3.250e-16 347-361
PR00048B 6.02 5.154e-11 501-511
PR00048B 6.02 1.000e-10 446-456
PR00048A 10.52 1.391e-10 513-527
PR00048A 10.52 2.565e-10 485-499
PR00048A 10.52 5.696e-10 402-416
PR00048B 6.02 8.875e-10 418-428
PR00048A 10.52 1.720e-09 430-444
PR00048B 6.02 3.368e-09 390-400

'DT>AAAylOA 1A CO O 'M\f\^ Aft I'^A 'itiO 

rKUUU4oA 1U.!>2 o.200e-U9 374-388 


285 


BL00276 


Channel forming colicins proteins.


BL00276A 8.87 6.500e-09 257-269 


286 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.000e-30 10-49


286 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 6.400e-16 388-401 PD00066 
13.92 3.769e-15 248-261 PD00066 13.92 
9.308e-15 304-317 PD00066 13.92 2.200e- 
14 360-373 PD00066 13.92 2.200e-14 416- 
429 PD00066 13.92 6.400e-14 332-345
PD00066 13.92 1.000e-13 220-233 PD00066
1 ^ »yZ 2. juUe-l D J y2-2U5 i'DUUUoo 1 3 .92 
5.000e-13 276-289 PD00066 13.92 5.500e- 
09 136-149 


286 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.286e-16 260-277 BL00028 
16.07 2.588e-14 288-305 BL00028 16.07 
2.800e-13 400-417 BL00028 16.07 6.850e- 
13 120-137 BL00028 16.07 3.423e-11 148-
165 BL00028 16.07 7.923e-11 344-361
BL00028 16.07 2.500e-10 204-221 BL00028
16.07 2.500e-10 428-445 BL00028 16.07
3.100e-10 316-333 BL00028 16.07 6.100e-
10 176-193 BL00028 16.07 1.771e-09 232-
249 BL00028 16.07 8.200e-09 372-389 


286 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.000e-17 257-271 
PR00048A 10.52 6.727e-15 397-411
PR00048A 10.52 2.929e-13 285-299
PR00048A 10.52 9.471e-12 369-383
PR00048B 6.02 1.000e-11 329-339
PR00048A 10.52 1.474e-11 313-327
PR00048A 10.52 2.421e-11 425-439
PR00048B 6.02 3.077e-11 385-395
PR00048A 10.52 6.684e-11 117-131
PR00048A 10.52 8.141e-11 201-215
PR00048A 10.52 1.783e-10 341-355 
PR00048B 6.02 2.125e-10 301-311
PR00048B 6.02 2.125e-10 357-367 
PR00048B 6.02 2.688e-10 217-227 
PR00048A 10.52 3.739e-10 229-243 
PR00048B 6.02 4.938e-10 273-283 
PR00048B 6.02 1.474e-09 245-255
PR00048A 10.52 2.440e-09 145-159
PR00048B 6.02 3.842e-09 161-171
PR00048B 6.02 8.105e-09 441-451
PR00048B 6.02 9.053e-09 189-199 


287 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.407e-23 3-42 


287 


BL00028 


Zinc finger, C2H2 type, domain
proteins. 


BL00028 16.07 8.941e-14 269-286 BL00028
16.07 1.000e-13 549-128 BL00028 16.07
2.565e-12 194-650 BL00028 16.07 6.087e-
12 241-258 BL00028 16.07 6.870e-12 297-
314 BL00028 16.07 6.870e-12 381-398
BL00028 16.07 7.214e-12 493-510 BL00028
16.07 1.346e-11 465-482 BL00028 16.07
1.692e-11 353-370 BL00028 16.07 3.769e-
11 325-342 BL00028 16.07 6.192e-11 167-
622 BL00028 16.07 8.962e-11 213-230
BL00028 16.07 1.600e-10 409-426 BL00028
16.07 5.200e-10 185-202 BL00028 16.07
6.700e-10 577-156 BL00028 16.07 3.057e-
09 521-538 BL00028 16.07 6.143e-09 437-
454


287 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.250e-14 238-252 
PR00048A 10.52 3.209e-12 266-280 
PR00048A 10.52 4.706e-12 490-504
PR00048A 10.52 5.765e-12 462-476
PR00048A 10.52 7.882e-12 630-644
PR00048A 10.52 8.941e-12 518-532
PR00048A 10.52 9.471e-12 164-178
PR00048A 10.52 5.737e-11 378-392
PR00048A 10.52 7.158e-11 546-122
PR00048B 6.02 7.231e-11 180-190
PR00048A 10.52 8.141e-11 210-224
rKUUU4oA y.ua je-i i zy'i-juo 
PR00048A 10.52 9.053e-11 406-420
PR00048A 10.52 3.348e-10 322-336 
PR00048B 6.02 3.813e-10 338-348 
PR00048B 6.02 3.813e-10 394-404 
PR00048B 6.02 3.813e-10 478-488
PR00048B 6.02 4.938e-10 506-516 
PR00048A 10.52 8.043e-10 434-448 
PR00048B 6.02 8.875e-10 226-236 
PR00048B 6.02 8.875e-10 450-460 
PR00048B 6.02 1.000e-09 366-376
PR00048B 6.02 1.000e-09 422-432
PR00048A 10.52 3.520e-09 136-588
PR00048B 6.02 7.158e-09 590-600
PR00048B 6.02 7.632e-09 310-320
PR00048B 6.02 7.632e-09 124-572
PR00048A 10.52 9.280e-09 350-364


289 


PR00070 


DIHYDROFOLATE REDUCTASE 
SIGNATURE 


PR00070C 13.09 6.143e-16 51-63
PR00070D 11.63 2.929e-15 112-127 


289


BL00075


Dihydrofolate reductase proteins.


BL00075A 27.70 7.900e-16 8-39 BL00075B
13.49 3.813e-15 51-63 BL00075C 8.51
2.862e-11 66-79 BL00075D 5.74 8.105e-10
113-123


292 


PR00250 


FUNGAL PHEROMONE MATING 
FACTOR STE2 GPCR SIGNATURE 


PR00250D 14.62 9.163e-09 254-278


294 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 2.731e-09 39-57


294 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080C 17.16 6.464e-11 191-211

PR00080A 9.32 9.750e-09 118-130 


295 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 8.920e-09 276-290 
PR00806B 4.28 9.202e-09 275-289


296 


PF00992 


Troponin. 


PF00992A 16.67 3.789e-10 553-588


296 


BL00752 


XPA protein. 


BL00752B 19.17 8.144e-09 130-612


296 


BL01160 


Kinesin light chain repeat proteins.


BL01160B 19.54 8.551e-09 536-590


298 


PR00511 


TEKTIN SIGNATURE 


PR00511C 7.86 4.214e-09 371-388


300 


BL00353 


HMG1/2 proteins.


BL00353B 11.47 9.171e-19 228-278

Xi'X^Wa/a/./Xi' X X f X 1 XW X^ fc*^ W St f V 


301 


PR00240 


ALPHA-1A ADRENERGIC
RECEPTOR SIGNATURE 


PR00240C 8.38 3.941e-10 316-336 


302 


BL00518


Zinc finger, C3HC4 type (RING
finger), proteins.


BL00518 12.23 2.200e-11 54-63


302 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 




PR00193


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 1.545e-31 390-419
PR00193C 12.60 1.209e-25 143-171
PR00193B 11.69 2.543e-24 95-121
PR00193A 15.41 6.885e-19 39-59 
PR00193E 19.47 3.291e-12 444-473 






ATP-binding region A proteins. 


RT/)nA75A 24 K6 3 475e.09 9X.142 


306 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 5.920e-ll 47-59 


306 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 7.923e-15 140-153 PD00066
13.92 4.000e-14 112-125 PD00066 13.92
1.391e-11 84-97 PD00066 13.92 1.692e-10
168-181 


306 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.059e-14 96-113 BL00028
16.07 4.130e-12 124-141 BL00028 16.07
2.385e-11 68-85 BL00028 16.07 8.269e-11
180-197 BL00028 16.07 8.962e-11 152-169
BL00028 16.07 9.400e-10 319-336


306 


PR00799 


ASPARTATE AMINOTRANSFERASE
SIGNATURE


PR00799D 16.46 5.125e-09 188-214


306 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 1.900e-13 81-91 PR00048A 
10.52 3.133e-13 65-79 PR00048A 10.52 
9.357e-13 121-135 PR00048A 10.52 9.357e-

147 PR00048A 10.52 4.522e-10 279-293
PR00048B 6.02 9.438e-10 109-119
PR00048B 6.02 8.105e-09 165-175


307 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELLS!. 


PD00015A 8.90 6.400e-09 35-43 


310 


DM00031 


IMMUNOGLOBULIN V REGION, 


DM00031B 15.41 3.662e-11 80-114


311 


BL00824 


Elongation factor 1 beta/beta'/delta
chain proteins. 


BL00824C 14.58 1.000e-40 129-167
DiJjKjoJAu 14.04 o.iy/e-jy lo/-zuz 
BL00824B 9.21 2.080e-21 96-116
BL00824E 12.49 3.333e-19 210-226 


312 


PR00501 


KELCH REPEAT SIGNATURE 


PR00501B 18.88 7.632e-09 476-491 
PR00501B 18.88 9.763e-09 523-538 


313 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.200e-30 43-82 


313 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 6.500e-13 439-452 PD00066
13.92 8.000e-13 355-368 PD00066 13.92
1.000e-12 383-396 PD00066 13.92 4.000e-
12 327-340 PD00066 13.92 5.714e-12 411-
424 PD00066 13.92 8.435e-11 299-312
PD00066 13.92 5.800e-14 467-480


313 


BL00028


Zinc finger, C2H2 type, domain
proteins. 


BL00028 16.07 2.565e-12 451-468 BL00028 
16.07 2.957e-12 311-328 BL00028 16.07 
3.348e-12 367-384 BL00028 16.07 1.692e- 
11 423-440 BL00028 16.07 2.731e-11 283-
300 BL00028 16.07 2.800e-10 339-356
BL00028 16.07 9.700e-10 199-216 BL00028
16.07 1.000e-09 395-412 BL00028 16.07

A c\Qfit>. no 1 on 1 •5*7 
4.Uooe-uy i 


313 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 5.909e-15 364-378 
PR00048A 10.52 2.286e-13 308-322 

PR00048A 10.52 6.824e-12 448-462
PR00048A 10.52 2.421e-11 196-210
PR00048A 10.52 1.000e-10 280-294
PR00048B 6.02 3.813e-10 324-334
PR00048B 6.02 4.375e-10 464-474
PR00048A 10.52 6.870e-10 336-350 
PR00048A 10.52 7.214e-10 420-434 
PR00048B 6.02 7.750e-10 436-446 
PR00048B 6.02 4.316e-09 380-390 


314 


PR00121 


SODIUM/POTASSIUM-TRANSPORTING ATPASE
SIGNATURE 


PR00121D 16.72 1.577e-13 210-232 


314 


PR00119 


P-TYPE CATION-TRANSPORTING
ATPASE SUPERFAMILY
SIGNATURE


PR00119B 13.94 9.194e-12 217-232




314 


BL01228 


Hypothetical cof family proteins. 


BL01228D 17.44 3.400e-11 646-671




BL00154


E1-E2 ATPases phosphorylation site
proteins.


BL00154E 20.37 4.054e-13 486-527
BL00154C 12.38 4.060e-12 213-232
BL00154F 8.23 9.597e-11 207-669


315 


BL00888 


Cyclic nucleotide-binding domain 
proteins. 


BL00888B 14.79 1.692e-10 396-420 


315 


BL00420 


Speract receptor repeat proteins domain
proteins. 


BL00420A 20.42 8.338e-09 215-682 


315 


DM00668 


ZEIN. 


DM00668A 10.20 8.500e-09 155-170 


316 


PR00727 


BACTERIAL LEADER PEPTIDASE 1 
(S26) FAMILY SIGNATURE 


PR00727C 13.04 9.063e-16 108-128 
PR00727B 12.51 7.848e-11 81-94


316 


BL00501


Signal peptidases I serine proteins. 


BL00501D 16.69 2.884e-13 108-128 
BL00501C 9.61 9.561e-11 81-93 BL00501B
12.58 7.000e-09 61-77 


317 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.471e-27 13-52 


317 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.235e-14 214-231 BL00028 
16.07 6.850e-13 270-287 BL00028 16.07 
9.100e-13 354-371 BL00028 16.07 1.391e- 
12 158-175 BL00028 16.07 1.346e-11 298-
315 BL00028 16.07 3.769e-11 242-259
BL00028 16.07 6.538e-11 380-397 BL00028
16.07 8.800e-10 186-203 BL00028 16.07 
1.514e-09 326-343 


317 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 3.000e-12 199-209 
PR00048A 10.52 7.882e-12 351-365 
PR00048A 10.52 8.412e-12 323-337 
PR00048A 10.52 8.941e-12 239-253 
PR00048A 10.52 1.474e-11 211-225
PR00048A 10.52 6.211e-11 155-169
PR00048B 6.02 7.231e-11 311-321
PR00048A 10.52 8.141e-11 267-281
PR00048B 6.02 3^50e-10 339-349 
PR00048B 6.02 3.813e-10 255-265 
PR00048B 6.02 7.188e-10 283-293 
PR0004SB 6 02 3 9A1tJN^ 17U1 R1 
PR00048B 6.02 3.842e-09 393-403 
PR00048A 10.52 8.200e-09 295-309 


319 


PR00004 


ANAPHYLATOXIN DOMAIN 
SIGNATURE 


PR00004C 12.46 8.141e-09 91-103 


320 


DM00060 


338 kw NEUREXIN ALPHA III
CYSTEINE.


DM00060 6.92 6.500e-11 28-38


320 


PR00010


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 7.667e-11 44-55


325 


PR00020


MAM DOMAIN SIGNATURE 


PR00020A 18.17 5.776e-12 344-363 
PR00020C 13.66 6.932e-10 417-429 


325 


BL00740 


MAM domain proteins. 


BL00740A 13.87 8.313e-12 346-359 
BL00740B 19.76 8.500e-09 486-507 


325 


PD02080 


T-CELL GLYCOPROTEIN CD8
CHAIN SURFACE ALPHA PRE.


PD02080B 20.69 9.621e-09 123-162




326 


BL00048


Protamine P1 proteins.


BL00048 6.39 6.128e-10 167-194 


326 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 9.791e-09 220-255 


327 


PR00020 


MAM DOMAIN SIGNATURE


PR00020C 13.66 2.615e-11 143-593
PR00020B 15.52 5.059e-10 52-69
PR00020B 15.52 1.789e-09 553-132


329 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.357e-32 8-47 


329 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 3.209e-14 284-301 BL00028
16.07 4.600e-13 508-525 BL00028 16.07
6.400e-13 368-385 BL00028 16.07 4.115e-
11 396-413 BL00028 16.07 4.115e-11 424-
441 BL00028 16.07 8.269e-11 172-189
BL00028 16.07 8.962e-11 256-273 BL00028
16.07 9.308e-11 312-329 BL00028 16.07
9.654e-11 200-217 BL00028 16.07 3.100e-
10 340-357 BL00028 16.07 5.500e-10 452-
469 BL00028 16.07 9.100e-10 480-497
BL00028 16.07 4.086e-09 228-245


329 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 7.000e-14 272-285 PD00066 
13.92 5.000e-13 328-341 PD00066 13.92 
5.500e-13 188-201 PD00066 13.92 5.500e- 
13 384-397 PDOOOoo 13.92 o.OOOe-13 496- 
509 PD00066 13.92 6.143e-12 468-481
PD00066 13.92 2.731e-10 440-453 PD00066
13.92 4.808e-10 160-173 PD00066 13.92
5.500e-10 244-257 PD00066 13.92 7.000e-
09 216-229 PD00066 13.92 7.000e-09 412-
425 


332 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 5.871e-11 468-501


332 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 8.043e-10 275-289


332 


BL00240 


Receptor tyrosine kinase class III
proteins. 


BL00240B 24.70 4.447e-09 430-454 


333 


BL00738 


S-adenosyl-L-homocysteine hydrolase 
proteins. 


BL00738J 18.61 1.000e-40 154-204

BL0U738H 23.08 5.320e-3o 4o8-521 

BL00738A 16.27 9.660e-27 216-256
BL00738C 16.53 7.923e-25 281-319
BL00738G 14.29 6.268e-23 446-468
BL00738B 12.28 8.085e-21 256-281
BL00738E 14.18 9.200e-19 361-384
BL00738I 14.57 5.135e-17 545-583
BL00738D 7.16 5.109e-13 335-350 


333 


BL00836


Alanine dehydrogenase & pyridine
nucleotide transhydrogenase.


BL00836D 22.30 8.622e-09 424-461


337 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 3.148e-09 80-100 


342 


PD01823 


PROTEIN INTERGENIC REGION
MITOCHONDRION T.


PD01823E 9.30 6.824e-12 108-121


rJJUloZ^L/ lO.OO l.ZO^e-U7 40-0/ 




rKOUy /O 


FAMILY SIGNATURE. 


ppnno'7#co 1 A 41 0 fi)'7n no Q0#i_An7 

rKUUy /0\^ iU.41 Z.oJ /e-UV J70-*fU / 


343 


DM00215


PROLINE-RICH PROTEIN 3.


T^AifAAOl CIO A1 1 XCOa AO 

JDMUU21D iy.4i 1.4joe-Uy 4/J-jUO 
DM00215 19.43 4.814e-09 463-496


343 


PR00671 


INHIBIN BETA B CHAIN 
SIGNATURE


PR00671C 4.18 9.172e-09 707-727


343 


PD01234 


PROTEIN NUCLEAR 
BROMODOMAIN TRANS. 


PD01234B 15.53 1.000e-08 482-500


344 


PR00175 


MYOGLOBIN SIGNATURE 


PR00175B 9.02 2.143e-10 25-49 


344 


PR00814 


BETA HAEMOGLOBIN 
SIGNATURE 


PR00814C 9.20 6.523e-10 66-84


344 


PR00173 


ERYTHROCRUORIN FAMILY
SIGNATURE


PR00173A 15.91 7.158e-10 25-48


344 


BL01033 


Globins profile. 


BL01033A 16.94 1.000e-16 25-47
BL01033B 13.81 8.615e-09 87-99 


344 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 122-139 
PR00612B 10.92 3.483e-10 32-43
PR00612D 9.76 9.438e-09 74-88 


345 


PR00814 


BETA HAEMOGLOBIN 
SIGNATURE 


PR00814C 9.20 6.523e-10 104-122


345 


BL01033


Globins profile. 


BL01033A 16.94 5.125e-10 63-85 
BL01033B 13.81 8.615e-09 125-137 


345 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 160-177 
PR00612B 10.92 3.483e-10 70-81 
PR00612D 9.76 9.438e-09 112-126


349 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.133e-32 6-45


350 


BL00972


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.


BL00972A 11.93 6.318e-19 364-382
BL00972D 22.55 7.968e-16 210-673 
BL00972B 9.45 1.600e-12 445-455 


350 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 8.008e-13 121-136
PR60049D 0.00 lyii^n 125-140 
PR00049D 0.00 5.916e-11 128-143
PR00049D 0.00 6.748e-11 122-137
PR00049D 0.00 9.395e-11 126-141
PR00049D 0.00 8.929e-10 127-142
PR00049D 0.00 2.678e-09 129-144
PR00049D 0.00 4.051e-09 123-138
PR00049D 0.00 4.051e-09 124-139
PR00049D 0.00 4.051e-09 130-145


350 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 7.500e-09 124-145


350 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.339e-10 108-141 
DM00215 19.43 7.268e-10 112-145 
DM00215 19.43 2.525e-09 106-139 
DM00215 19.43 9.695e-09 107-140 


350 


BL00048 


Protamine P1 proteins.


BL00048 6.39 9.888e-09 145-172 


352 


BL00518 


Zinc finger, C3HC4 type (RING
finger), proteins.


BL00518 12.23 4.429e-10 214-223




353 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 4.429e-10 179-188 


354 


BL01009 


Extracellular proteins SCP/Tpx- 
1/Ag5/PR-1/Sc7 proteins. 


BL01009D 14.19 9.341e-17 160-181 
BL01009A 13.75 3.769e-14 80-98
BL01009E 13.50 5.333e-14 194-210
BL01009C 10.54 2.667e-11 127-141


354 


PR00838 


VENOM ALLERGEN 5 SIGNATURE 


PR00838G 16.07 2.304e-14 158-178 
PR00838D 8.73 4.452e-12 80-99 PR00838F
10.11 7.532e-10 125-141 


354 


PR00837 


ALLERGEN V5/TPX-1 FAMILY
SIGNATURE 


PR00837C 17.21 7.429e-18 159-176 
PR00837A 14.77 1.900e-15 80-99 
PR00837D 11.12 2.198e-13 195-209 
PR00837B 11.64 3.483e-09 127-141 


356 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 8.500e-17 16-41 
BL00215B 10.44 4.900e-09 177-190 
BL00215A 15.82 6.786e-09 133-158
BL00215B 10.44 7.300e-09 278-291 


356 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE


PR00926E 11.70 6.049e-13 91-110
PR00926F 17.75 7.600e-11 240-263
PR00926F 17.75 5.219e-10 18-41 PR00926D 
10.53 /.Dole-Oy 24o-2o5 


357 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE


PR00326A 8.75 7.150e-11 21-42


357




Adenylate kinase proteins. 


rJLUUl ID A 12.74 o.o7/e-0y 22-39 


357 


BL01128 


Shikimate kinase proteins.


BL01128A 18.84 7.802e-09 21-55


357 


BL00300 


SRP54-type proteins GTP-binding 
domain proteins. 


BL00300B 20.56 1.000e-08 18-64


358


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.


BL00972A 11.93 6.318e-19 324-342
BL00972D 22.55 3.903e-16 170-194
BL00972B 9.45 1.600e-12 405-415


364 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.482e-10 355-388


364 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 4.600e-10 302-318


365 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins.


BL00518 12.23 2.800e-11 125-134


365 


BL00415 


Synapsins proteins. 


BL00415N 4.29 2.839e-09 387-431 








jjMuuziD iy.*ij /./uoe-11 3//-410 
DM00215 19.43 8.412e-11 333-366
DM00215 19.43 2.678e-09 356-389 
DM00215 19.43 5.138e-09 376-409 


365 


BL01102


Prokaryotic dksA/traR C4-type zinc
finger.


BL01102 15.99 5.705e-09 109-135


365 


PR00211 


GLUTELIN SIGNATURE


PR00211B 0.86 5.959e-11 407-428
PR00211B 0.86 2.212e-10 401-422
PR00211B 0.86 9.500e-09 336-357


365 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 9.695e-09 335-350


367 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 8.448e-09 2-23


370


BL00028 


Zinc finger, C2H2 type, domain 
proteins.


BL00028 16.07 7.353e-14 157-174 BL00028
16.07 1.000e-13 269-286 BL00028 16.07
8.200e-13 493-510 BL00028 16.07 3.739e-
12 213-230 BL00028 16.07 6.478e-12 381-
398 BL00028 16.07 1.346e-11 185-202
BL00028 16.07 2.385e-11 129-146 BL00028
16.07 2.385e-11 325-342 BL00028 16.07
5.154e-11 241-258 BL00028 16.07 9.654e-
11 437-454 BL00028 16.07 1.300e-10 297-
314 BL00028 16.07 9.100e-10 409-426
BL00028 16.07 9.100e-10 465-482


370 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 2.385e-15 229-242 PD00066
13.92 3.077e-15 145-158 PD00066 13.92
8.800e-14 173-186 PD00066 13.92 3.500e-
13 369-382 PD00066 13.92 8.500e-13 341-
354 PD00066 13.92 9.133e-12 397-410
PD00066 13.92 2.174e-11 313-326 PD00066
13.92 3.348e-11 453-466 PD00066 13.92
3.739e-11 481-494 PD00066 13.92 7.214e-
11 257-270 PD00066 13.92 2.038e-10 425-
438 PD00066 13.92 6.538e-10 201-214
PD00066 13.92 5.200e-09 285-298


370 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 6.201e-09 265-278


370 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.474e-11 462-476
PR00048A 10.52 6.684e-11 182-196
PR00048A 10.52 2.957e-10 434-448
PR00048B 6.02 5.500e-10 338-348
PR00048A 10.52 6.478e-10 350-364
PR00048B 6.02 6.187e-10 226-236 
PR00048A 10.52 6.870e-10 490-504 
PR00048A 10.52 8.826e-10 406-420 
PR00048B 6.02 3.842e-09 170-180 
PR00048B 6.02 4.316e-09 365-376
PR00048B 6.02 4.789e-09 478-488 
PR00048B 6.02 7.632e-09 142-152 
FR00048A 10.52 o.l22e-09 126-140 


371 


BL01019 


ADP-ribosylation factors family
proteins. 


BL01019B 19.49 6.276e-21 95-150 
BL01019A 13.20 8.453e-17 51-91 


371 


PR00328 


GTP-BINDING SARI PROTEIN 
SIGNATURE 


PR00328C 13.16 8.481e-13 78-104 
PR00328D 12.56 3.357e-11 123-145


371 


BL01115 


GTP-binding nuclear protein ran 
proteins. 


BL01115A 10.22 8.119e-11 21-65


373 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.522e-12 208-225 


373 


PD00066 


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 7.000e-13 194-207 PD00066 
13.92 7.000e-13 224-237 PD00066 13.92 
7.000e-12 254-267 


373 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 1.391e-10 205-219


PR00048B 6.02 6.063e-10 221-231


J/4 




1 Yrc 1 ATM 1 JirKDJaz^fc rKUl cUN 

SIGNATURE 


DDAA'ZAQA C OA *? IQQa 11 KAQ 

rKUU3UoA /.zooe-l 1 j j304o 
PR00308A 5.90 8.835e-09 534-549 


3/1 




RTOONUCLEOPROTEIN. 


rLKl27ow 2o.4o 7.53oe-09 147-190 


378 


PD01351


PROTEIN REPEAT 
NEUROFILAMENT TRIPL.


PD01351A 8.69 7.469e-09 155-166


380 


PF00094 


von Willebrand factor type D domain 
proteins. 


PF00094C 12.88 1.918e-09 43-53


380 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 3.667e-11 120-135
BL01208B 15.83 1.973e-09 178-193 


380 


PD02138 


PRECURSOR GLYCOPROTEIN
SIGNAL CELL.


PD02138A 27.60 9.057e-09 20-69


381 


BL01105 


Ribosomal protein L35Ae proteins. 


BL01105B 12.95 7.930e-13 43-83 


384 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 9.205e-10 10-25 PR00049D
0.00 1.915e-09 9-24 


385 


BL01115 


GTP-binding nuclear protein ran 
proteins. 


BL01115A 10.22 8.909e-13 34-78


385 


BL00905 


GTP1/OBG family proteins.


BL00905D 15.00 5.313e-09 140-155 


385 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17^7 3.209e-19 75-98 
PR00449A 13.20 1.000e-17 34-56
PR00449D 10.79 3.368e-13 139-153 
PR00449B 14.34 8.3o4e-ll 57-74 PR00449E 
13.50 8.286e-09 174-197 


JoD 




Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 


BLUUl loZ 3.12 l.yl/G'-iD 397-446 






CAMP KcorUNdEr ELbMcN 1 
SIGNATURE 


PROtMWlF 0.53 9.365e-09 256-274 






F-box domain proteins.


rrilUOHOA 14.3/ 7,U30e-iU ^o-4Z 


389 


BL00036


bZIP transcription factors basic domain
proteins.


BL00036 9.02 6.294e-12 81-94






FOS TRANSFORMING PROTEIN

SIGNATURE 


DD AAA/1 '^/^ Q OA Q IACa 110^ AO l>1>AAA>IOt^ 

rKUUU4zL/ ©.zV o.lUje-13 o2-9y rKUUU42D 
8.97 9.895e-10 100-122 






Clathrin light chain proteins.


UT AAOO>IT3 1 £ 0>l ^ ^"T^a AO *TA 1 

15LiNJ224i3 I0.74 3.3 /DCAJy 7U-123 


389 


PR00043


JUN TRANSCRIPTION FACTOR


PR00043B 8.73 9.396e-09 81-98


390 


PF00622 


Domain in SPla and the RYanodine
Receptor. 


PF00622B 21.00 2.500e-13 85-107 


391 


BL00564 


Argininosuccinate synthase proteins. 


BL00564A 19.93 6.114e-09 7-44 


392 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 230-244 
PR00048A 10.52 4.316e-11 202-216


392 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.125e-15 205-222 BL00028 
16.07 1.391e-12 233-250 BL00028 16.07 
3.400e-10 177-194 


392 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 3.000e-13 193-206 PD00066 
13.92 3.423e-10 221-234 


393 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 1.391e-16 132-154


393


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 8.800e-10 761-778 BL00028
16.07 2.029e-09 789-806 


393 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.800e-09 758-772 


394 


PR00501 


KELCH REPEAT SIGNATURE 


PR00501A 8.25 1.409e-09 537-551 


394 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL. DIHYDROPTERIDINE 


DM00099B 14.73 4.375e-09 415-425 


395 


PR00399 


SYNAPTOTAGMIN SIGNATURE 


PR00399A 9.52 3.133e-19 146-162 
PR00399C 12.82 8.200e-17 222-238 
PR00399B 14.27 7.750e-16 161-175 
PR00399D 14.48 4.000e-14 242-253 


395 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 8.269e-13 201-215 
PR00360A 14.59 2,800e-12 174-187 
PR00360B 13.61 5.217e-12 340-354 
PR00360A 14.59 5.207e-10 311-324 


395 


PF00168 


C2 domain proteins. 


PF00168C 27.49 5.500e-18 323-349 
PF00168B 11.83 2.000e-09 306-317 


396 


BL01013 


Oxysterol-binding protein family
proteins.


BL01013A 25.14 7.231e-21 558-156
BL01013B 11.33 1.000e-11 185-196


396 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 3.534e-10 52-107 


396 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


PD00078B 13.14 9.000e-11 173-186
PD00078B 13.14 3.739e-09 78-91 
PD00078B 13.14 4.130e-09 45-58 


396 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 3.077e-11 48-58 PF00023B
14.20 3.769e-11 176-186 PF00023A 16.03
7.429e-09 85-101 


397 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 1.750e-10 55-71 


397 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 4.455e-11 55-110
PF00791B 28.49 7.291e-10 88-143 


398 


BL00422 


Granins proteins. 


BL00422C 16.18 5.787e-10 134-162 


400 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450D 16.58 8.986e-11 161-181


400 


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins.


BL00479B 12.57 4.273e-15 287-303
BL00479A 19.86 2.667e-14 261-284
BL00479B 12.57 1.360e-10 351-367


400 


PR00171 


CLASS III CYTOCHROME C
SIGNATURE


PR00171D 7.30 9.419e-10 334-342


400 


BL00018 


EF-hand calcium-binding domain
proteins. 


BL00018 7.41 3.348e-09 223-236 


400 


PF00781 


Diacylglycerol kinase catalytic domain 
proteins (presumed). 


PF00781F 16.43 1.000e-40 600-199
PF00781B 12.07 8.364e-35 454-486
PF00781D 11.11 3.077e-30 532-118
PF00781C 9.69 5.034e-19 506-521
PF00781E 12.45 2.385e-17 124-583
PF00781G 10.09 6.211e-17 678-692
PF00781H 12.20 1.750e-16 770-782 
PF00781A 6.42 3.667e-09 354-360 


401 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.407e-09 325-340 


402 


DM01117 


2 kw TRANSPOSASE WITHIN
TRANSPOSITION VASOTOCIN.


DM01117A 11.17 7.750e-09 364-382








PROTEIN, 


DM01206B 10.69 3.466e-10 726-746

DM01206B 10.69 7.152e-09 718-738 
DM01206B 10.69 8.861e-09 728-748 


403 


BL00048 


Protamine PI proteins. 


BL00048 6.39 4.197e-10 722-749 BL00048 
6.39 5.500e-10 731-758 BL00048 6.39 

730-757 BL00048 6.39 4.038e-09 728-755 
BL00048 6.39 8.538e-09 724-751 BL00048
6.39 9.438e-09 716-743 


403 


PD00289


PROTEIN SH3 DOMAIN REPEAT 
rKboYNA. 


PD00289 9.97 9.690e-09 130-144


404 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 1.353e-27 31-70 


404 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.154e-15 274-287 PD00066 
13.92 7.600e-14 246-259 PD00066 13.92 
8.200e-14 302-315 PD00066 13.92 3.143e- 
12 218-231 PD00066 13.92 4.000e-12 190-
203 PD00066 13.92 2.800e-09 330-343 


404 


BL00028


Zinc finger, C2H2 type, domain
proteins.


BL00028 16.07 7.261e-12 230-247 BL00028
16.07 9.171e-12 342-359 BL00028 16.07
4.300e-10 314-331 BL00028 16.07 7.000e- 

10 174-191 BL00028 16.07 3.314e-09 202- 
219 BL00028 16.07 6.400e-09 286-303 


404 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.214e-13 339-353 
PR00048A 10.52 3.209e-12 227-241 

PR00048A 10.52 4.522e-10 171-185 
PR00048B 6.02 2.895e-09 299-309 
PR00048A 10.52 4.600e-09 199-213 
PR00048B 6.02 1.000e-08 187-197
PR00048B 6.02 1.000e-08 271-281 


406 


BL00610


Sodium:neurotransmitter symporter
family proteins.


BL00610A 17.73 1.000e-40 68-118

DLajUOIVd Zj.OD i*UUUe-HU l^Z-IoZ 

BL00610C 12.94 1.000e-40 225-277
BL00610D 20.97 1.000e-40 291-144
BL00610F 29.02 6.143e-36 540-157
BL00610E 20.34 3.209e-35 448-491
BL00610G 12.89 2.200e-15 173-196 


406 


PR00176 


SODIUM/NEUROTRANSMITTER
SYMPORTER SIGNATURE 


PR00176C 10.84 6J226e-23 141-168 
PR00176A 16.82 1.450e-22 68-90 PR00176F 
10.73 8.667e-20 452-472 PR00176B 7.31
7.000e-18 97-117 PR00176D 9.02 1.000e-17
252-270 PR00176E 11.41 2.756e-15 334-355 
PR00176H 15.27 7.353e-15 131-590 
PR00176G 12.48 5.615e-14 529-112 


407 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.304e-09 111-121


408 


PR00187 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00187B 13.48 1.800e-16 45-66
PR00187A 12.84 6.700e-12 15-35 


408


BL00198


Nt-dnaJ domain proteins.


BL00198B 15.11 9.217e-15 45-66
BL00198A 8.07 2.459e-11 19-36


409 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 4.136e-11 246-268


409


BL00215


Mitochondrial energy transfer proteins.


BL00215A 15.82 5.787e-11 108-133
BL00215B 10.44 6.211e-11 258-271
BL00215A 15.82 5.018e-09 211-236


409 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926D 10.53 5 355e^ 19-38 


410 


PD00066


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 6.400e-17 411-424 PD00066 
13.92 8.200e-17 327-340 PD00066 13.92 








5.154e-15 271-284 PD00066 13.92 2.800e-
14 215-228 PD00066 13.92 9.000e-13 355-

PD00066 13.92 6.478e-11 187-200 PD00066
13.92 9.217e-11 243-256


410 


BL00028


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.588e-14 227-244 BL00028 
16.07 6.824e-14 395-412 BL00028 16.07 
7.882e-14 171-188 BL00028 16.07 2.350e- 
13 339-356 BL00028 16.07 7.300e-13 283- 
300 BL00028 16.07 7.300e-13 367-384 
BL00028 16.07 2.565e-12 423-440 BL00028 
16.07 7.261e-12 199-216 BL00028 16.07
7.261e-12 311-328 BL00028 16.07 8.435e-
12 451-468 BL00028 16.07 2.038e-11 255-
272 BL00028 16.07 9.400e-10 143-160


410 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 3.250e-14 280-294 
PR00048A 10.52 8.500e-14 336-350

rKUUWoA IU.Oa /.^Z5'e-1 j Z^x-xOO 

PR00048A 10.52 8.714e-13 448-462
PR00048A 10.52 1.000e-12 168-182
PR00048A 10.52 2.059e-12 420-434
PR00048B 6.02 8.615e-11 408-418
PR00048B 6.02 7.188e-10 268-278
PR00048B 6.02 7.188e-10 380-390
PR00048B 6.02 9.438e-10 296-306
PR00048B 6.02 1.000e-09 324-334
PR00048B 6.02 1.474e-09 352-362
PR00048B 6.02 3.842e-09 212-222
PR00048B 6.02 5.263e-09 436-446


411 


BL00018


EF-hand calcium-binding domain
proteins. 


BL00018 7.41 5.500e-10 63-76 


413 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014C 15.44 4.600e-10 73-92 


414 


PR00806 


VINCULIN SIGNATURE 


PR00806A 6.63 1.493e-09 785-796


414 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 4.240e-09 41-55




414 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.546e-11 781-796
PR00049D 0.00 1.205e-10 263-278
PR00049D 0.00 4.356e-09 785-800 






I^ClUOlllUUUiUI ^JI\x'*rJ) JJiVlLClIls* 


RT ftfUl 2D 1 6 54 4 673e-09 420-471 


414 


BL00422


Granins proteins. 


BL00422C 16.18 6.318e-11 439-467
nfM9?r 1 ^ 1 R Q ROQp-lO 440.46R 

DLAfKfH^AS^ lO.lO j?.OV/7C-XV 'rf\r^\tO 

BL00422C 16.18 6.294e-09 441-469 
RT OfUl^r 1 6 1 6 209e-09 43S-466 


414 


PR00910 


LUTEOVIRUS ORF6 PROTEIN


PR00910A 2.51 8.179e-09 265-278


414 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 4.203e-09 770-803


414 


BL00028


Zinc finger, C2H2 type, domain
proteins.


BL00028 16.07 1.257e-09 44-61 BL00028

n7 0 ^A'^P-ftQ 17^-109 RT/)nn9R 1#5 07 

6.143e-09 119-136 BL00028 16.07 9.743e- 
no 147-1 fid 


415 


PF00622 


Domain in SPla and the
RYanodine Receptor.


PF00622B 21.00 1.000e-13 331-353


415 


BL00518


Zinc finger, C3HC4 type (RING
finger), proteins.


BL00518 12.23 3.400e-11 31-40


416 


PF00780 


Domain found in NIK1-like kinases,
mouse citron and yeast ROM. 


PF00780B 23.03 5.929e-33 442-485 


416


PR00109 


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 5.235e-12 211-230


416 


BL00107 


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 5.200e-22 211-242
BL00107B 13.31 9.308e-12 283-299


416 


BL00239


Receptor tyrosine kinase class II
proteins. 


BL00239B 25.15 5.164e-10 145-193 


416 


BL00915


Phosphatidylinositol 3- and 4-kinases
proteins.


BL00915C 22.43 9.357e-10 203-242


417 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 1.482e-14 41-59 
BL00021D 24.56 2.122e-12 193-235 


417 


PR00722 


CHYMOTRYPSIN SERINE 
PROTEASE FAMILY (S1)
SIGNATURE 


PR00722A 12.27 7.517e-14 42-58
PR0072% 12,51 3.143e-10 97-112 


417 


BL00134 


Serine proteases, trypsin family, 
histidine proteins. 


BL00134A 11.96 6.464e-16 41-58 
BL00134C 13.45 2.059e-09 221-235 


417 


BL00495 


Apple domain proteins. 


BL00495O 13.75 2.440e-09 212-241 


417 


BL00672


Serine proteases, V8 family, histidine
proteins.


BL00672A 9.79 9.520e-09 41-57


417 


PR00839 


V8 SERINE PROTEASE FAMILY 
SIGNATURE 


PR00839B llJ09.753e-09 41-59 


418 


BL01207 


Glypicans proteins. 


BL01207B 23.69 9.122e-28 191-237 
BL01207A 12.21 1.000e-16 62-78


423 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870D 15.74 4.351e-09 693-728 


423 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.696e-09 793-803



424 


BL00203 


Vertebrate metallothioneins proteins.


BL00203 13.94 5.041e-09 13-59 


425 


BL00107 


Protein kinases ATP-binding region 
proteins. 


TiTOftlATA 1R ft 1A1«».1ft 917-^4R 


425 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240E 11^6 6.040e-10 203-241 


425 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 5.814e-l4 21 /rZJo 
PR00109A 15.00 1.730e-09 182-196 


428 


PR00141 


PROTEASOME COMPONENT 
SIGNATURE 


PR00141C 11.15 6.333e-12 234-240
PR00141D 12.45 8.615e-12 259-271
PR00141B 11.15 9.501e-12 223-233
PR00141A 11.36 2.050e-11 102-118


428 


BL00854


Proteasome B-type subunits proteins.


BL00854A 33.93 1.383e-19 99-145
BL00854C 29.92 5.235e-14 206-235 
BLOUoMD 1j.7o Z.oUUe-uy zD/-2o/ 


429 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 9.413e-17 59-81 
rRUU24DC /.o4 /.50Ue-l0 i:Jo-Zj4 
PR00245E 12.40 2.500e-12 291-306 

i^lvUUxHJlS X U.JO 7.1 1^-11 It f'iyl' 


429 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 7.120e-12 199-223 

PPnn777P 1^ M 1 77^*>-n0 104-177 


429 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 9.727e-14 90-130 
nr nn7i7r\ 1 1 7*^ i 77^j»_no 7ft7-700 


429 


PR00534 


MELANOCORTIN RECEPTOR
FAMILY SIGNATURE


PR00534A 11.49 6.400e-09 51-64 


430 


PF00651 


BTB (also known as BR-C/Ttk) domain
proteins.


PF00651 15.00 1.000e-11 87-100


430 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.706e-14 474-491 BL00028
16.07 1.771e-09 502-519 


430 


rJJUUUoo 


HINDI. 


X l.^UUUQO H.3UVV-U7 *T7v-JVJ 


430 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 4.600e^ 499-513


433 


BL00086 


Cytochrome P450 cysteine heme-iron 
ligand proteins. 


BIJ00086 20^7 3.209e-23 430-462 


433 


PR00465 


E-CLASS P450 GROUP IV
SIGNATURE


x'KUWODr ij.J / l.JOUe-il 4UU-417 


433 


PR00359 


B-CLASS P450 SIGNATURE 


PR00359G 11.22 8.071e-10 401-417 
PR00359F 24.20 2.180e-09 373-401


433 


PR00385


P450 SUPERFAMILY SIGNATURE 


PR00385E 12.66 8.800e-11 440-452
PR00385D 13.11 4.429e-10 431-441 
PR00385A 14.97 5.865e-09 302-320 


433 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE 


PR00464G 12.41 9.000e-10 405-421 
PR00464D 17.40 1.191e-09 320-338 
PR00464E 18.28 6.546e-09 349-370
PR00464H 13.32 7.750e-09 427-441 
PR00464C 18.84 9.014e-09 291-320 
PR00464I 14.64 9.481e-09 440-464


434 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 7.943e-19 101-151 


434 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.593e-11 413-435


435 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.429e-10 10-25 


435 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.150e-13 138-593
BL00028 16.07 6.850e-13 1010-1027
BL00028 16.07 6.087e-12 982-999
BL00028 16.07 8.615e-11 846-863
BL00028 16.07 3.100e-10 317-334
BL00028 16.07 7.000e-10 170-187
BL00028 16.07 8.500e-10 289-306
BL00028 16.07 8.800e-10 548-565


435 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDL 


PD00066 13.92 7.600e-14 998-1011
PD00066 13.92 1.000e-11 305-318
PD00066 13.92 8.826e-11 564-577
PD00066 13.92 3.400e-09 862-875


435 




RIBOSOMAL PROTEIN P2

SIGNATURE 


PR00456E 3.06 5.899e-09 140-155 






Streptomyces subtilisin-type inhibitors


BL00999A 14.95 7.223e-09 461-499 


435 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.357e-13 573-587 
PR00048A 10.52 2.421e-11 1007-1021
PR00048B 6.02 2.125e-10 561-133
PR00048A 10.52 8.043e-10 314-328
PR00048B 6.02 1.000e-09 995-1005
PR00048B 6.02 6.684e-09 302-312 
PR00048A 10.52 9.280e-09 167-181 


436 


PR00245


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 2.667e-23 100-122 
PR00245C 7.84 1.783e-14 232-248 
PR00245D 10.47 7.070e-10 268-280 


436 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237C 15.69 8.500e-11 145-168
PR00237G 19.63 6.023e-09 266-293


436 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.161e-15 131-171 
BL00237D 11.23 8.091e-09 276-293 


437 


PR00262 


IL1/HBGF FAMILY SIGNATURE


PR00262A 28.26 1.000e-08 80-108


438 


BL00884 


Osteopontin proteins. 


BL00884B 12.47 1.000e-40 50-94
BL00884C 22.45 6.187e-39 131-173
BL00884A 11.35 5.846e-32 1-31
BL00884E 11.04 8.364e-23 273-295
BL00884D 8.79 3.323e-18 255-272


438 


PR00216 


OSTEOPONTIN SIGNATURE 


PR00216B 7.89 4.553e-34 37-67
PR00216A 10.94 8.054e-33 2-32
PR00216C 9.63 2.565e-32 67-93
PR00216G 12.39 8.676e-27 238-264
PR00216H 7.41 5.295e-22 273-293
PR00216F 11.79 3.133e-21 164-183
PR00216D 2.74 5.800e-18 104-119
PR00216E 8.44 4.405e-16 132-147

* Results include, in order: Accession No., subtype, e-value, and amino acid position of the signature in
the corresponding polypeptide.
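
The result entries described in the footnote above follow a regular pattern, so they can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how one results cell might be parsed in Python. The names `SignatureHit` and `parse_hits` are hypothetical. The footnote does not name the numeric field that sits between the accession and the e-value, so the sketch simply labels it `score` as an assumption. It also assumes that entries wrapped across lines (for example an e-value split after `e-`) should be rejoined before matching.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    accession: str   # accession number plus optional subtype letter, e.g. "PR00048A"
    score: float     # field between accession and e-value (unnamed in the footnote; assumed to be a score)
    e_value: float
    start: int       # first amino acid position of the signature
    end: int         # last amino acid position of the signature


# One entry: accession(+subtype), number, e-value, start-end
_HIT = re.compile(
    r"(?P<acc>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<evalue>\d+(?:\.\d+)?e[-+]?\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_hits(cell: str) -> List[SignatureHit]:
    """Parse every well-formed entry in one results cell; malformed entries are skipped."""
    # Rejoin values that were wrapped across lines, e.g. "2.350e-\n13" or "283-\n300".
    cell = re.sub(r"e-\s+", "e-", cell)
    cell = re.sub(r"(\d)-\s+(\d)", r"\1-\2", cell)
    return [
        SignatureHit(m["acc"], float(m["score"]), float(m["evalue"]),
                     int(m["start"]), int(m["end"]))
        for m in _HIT.finditer(cell)
    ]


if __name__ == "__main__":
    for hit in parse_hits("BL00021B 13.33 1.482e-14 41-59\nBL00021D 24.56 2.122e-12 193-235"):
        print(hit)
```

Entries whose characters were damaged in reproduction will not match the pattern and are silently dropped, which is usually preferable to guessing at their values.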
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SEQ 

ID


Pfam Model


Description 


E-value


Score 


No. of

Pfam

Domains 


Position of 
the Domain 


1 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-
H type


1.8e^5 


31.6 


1 


412-438 


1 


zf-C3HC4 


Zinc finger, C3HC4 type
(RING finger)


2e-05 


21.8 


1 


14-52 


3 


EMP24_GP25L


emp24/gp25L/p24 family


4.1e-105


362.6 


1 


22-235 


6 


WW 


WW domain 


1.2e-05


32.2 


1 


45-75 


7 


WW 


WW domain 


1.2e-05 


32.2 


1 


45-75 


8 


Aa_trans


Transmembrane amino acid
transporter protein


9.6e-64 


225.2 


1 


71-451 


9 


Fe-ADH


Iron-containing alcohol
dehydrogenase


9.9e-35


1243 . 


2 


4-205:228-
255


10 


Fe-ADH 


Iron-containing alcohol 
dehydrogenase


9.9e-35


124.5 


2 


52-253:276- 
303 


11 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.016 


-2.1 


1 


257-356 


12 


spectrin 


Spectrin repeat 


1.3e-10 


43.6 


3 


11-87:90-
197:200-291


13 


Ribosomal_L18ae


Ribosomal L18ae protein
family


1.9e-128


440.1 


1 


6-176 


14 


Ribosomal_L31e


Ribosomal protein L31e


2.4e-47


170.7 




72-166 


15 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-
H type


7.8e-16


66.0 




342-367:371- 
396:398-420 


16




MYND finger


1.4e-13


58.5




52-90 


17


Sterile


Male sterility protein 


l.leoi 


lfl3.1 




254-440 


18 


MgtE 


Divalent cation transporter 


8.6e-39 


142.3 




138-274:352- 
4yy 


io 
ly 




Kap/ran-OAr 


1 0il 


4zo./ 




J 


4UU-300 


ly 




PDZ domain (Also known as 
DHR or GLGF)


2.4e-Uo 






/20-ouU 


20 


Rap_GAP 


Rap/ran-GAP 


2e-124 


426.7 


1 


400-588 




'Dn'7 
JrU^ 


PDZ domain (Also known as 
DHR or GLGF)


Z.4&-UO 


1A < 




/ZD-oUU 


22 


SCAN 


SCAN domain 


1.5e-23


91.7 


1 


165-238 






KnoijrAx' Qomaui 


Je-jo 


2U0.V 


-1 


49/-o4y 




FCH


Fes/CIP4 homology domain


Lze^io 


/D.4 




22-121 


23 


SH3 


SH3 domain 


2.6e-11


51.0 




723-777 


74 




Zrflll^UUlUUl^ UCUyUIUgCUaaCo 










25 


UDPGT 


UDP-glucoronosyl and UDP-
glucosyl transferas


1.6e-84 


294.3 




26-467 


28 


Ribosomal_L6e


Ribosomal protein L6e 


4.3e-77 


269.5 




109-239 


29 


Ribosomal_L11


Ribosomal protein L11


4.9e^ 


226.2 




13-144 


30 


tRNA-synt_1e


tRNA synthetases class I (C)


1.6e-137


470.2 




64-538 


32 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.00041 


17.6 




33-72:165- 
185 


34 


ras 


Ras family 


1.4e-77 


271.2 




35-235 


34 


arf 


ADP-ribosylation factor
family


9.3e^ 


-56.3 




17-198 


36 


SET 


SET domain 


3.2e-05 


10.0 


1 


209-342


36 


MORN 


MORN repeat 


0.006 


232 


3 


36-58:59- 
81:106-128


37


laminin_G 


Laminin G domain 


1.3e-11


44.7 




55-174 


37 


EGF 


EGF-like domain 


0.0033 


24.1 






38 


Sema 


Sema domain 


1.7e-127 


436.9 




56-489 


38 


Plexin_repeat


Plexin repeat


1e-06


35.7 




507-563 


38 


ig 


Immunoglobulin domain


0.0023 


15.9 




582-639 


38 


integrin_B


Integrins, beta chain


0.084 


6.1 




513-527 


40 


filament 


Intermediate filament protein 


1.6e-138 


473.6 


1


129-442 


41 


Keratin_B2


Keratin, high sulfur B2 
protein 


1.8e-18 


74.8 


2 


2-138:139- 
240 


44 


sushi 


Sushi domain (SCR repeat) 


3.8e^ 


33.9 


4 


1396- 

1459:1464- 
1521:1525- 
1590:1595- 
1646 


45 


profilin 


Profilin 


4.1e-13


51.7 


1 


10-124 


47 


ubiquitin


Ubiquitin family 


0.00033 


20.5 


1 


31-99 


48 


BTB 


BTB/POZ domain 


2.6e-21 


SA2 


1 


80-196 


48 


Kelch 


Kelch motif 


2.6e-20 


80.9 


4 


336-382:384- 

430:432- 

478:582-635 


48 


SCP 


SCP-like extracellular protein 


0.015 


13.0 


1 


1-35 


49 


serpin 


Serpin (serine protease 
inhibitor) 


2.4e-178


605.4 


1 


59-432 


50 


T-box 


T-box 


3.6e-125 


429.2 


1 


140-331 


52 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 




58.3 


2 


132-228:337-
344


53 


CSD 


'Cold-shock' DNA-binding
domain 


1.8e-16 


63.6 


1 


42-112 


jj 




Zinc knuckle 


U.u(/Ul2 


^.O 


2 


137-154:159- 

176 






Immunoglobulin domain


Z.De-ll/ 


Zo./ 


1 
1 


1A IflO 


55 


Rap_GAP


Rap/ran-GAP


5e-18


73.3 


1 


287-466 


57


G-gamma


GGL domain


l.oe-li 




2 


49-70:109- 




l-DOX 


1-DOX 


O t\A 1 1 il 

o.9e-l 14 




1 


101oU2 


59 


Gag_p10


Retroviral GAG p10 protein


9.2e-06 


23.7 


1 


82-171 


61 


60s_ribosomal


60s Acidic ribosomal protein 


0.0089 


12.0 


1 


1-22 


62 


UPAR_LY6


u-PAR/Ly-6 domain


5.4e-05


22.3 


1 


8-51 


63 


Ribosomal_L30


Ribosomal protein L30p/L7e 


0.00042 


18.5 


1 


65-93 


64


filament


Intermediate filament protein 


i.ie->/o 






426 


65 


Ribosomal_S6 


Ribosomal protein S6


0.00082 


73 


1 


2-96 


66 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


5.1e-09 


43.4 


1 


158-250 


67 


zf-C3HC4


Zinc finger, C3HC4 type 
(RING finger) 


0.005 


14.0 


1 


92-118 


68 


G-patch 


G-patch domain 


6.8&W 


36.3 


1 


26-70 


69 


Keratin_B2


Keratin, high sulfur B2 
protein 


0.037 


-453 


1 


10-155 


83 




Immunoglobulin domain 


8J5e-09 


33.4 


2 


34-89:119- 
187 


86 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-71


250.6 


17 


182-204:210-
232:237-
260:265- 
288:315- 
337:343- 
365:369- 
392:653- 
675:681- 
704:709- 
733:741- 
764:791- 
814:820- 
842:848- 
870:877- 
899:905- 
928:952-975 


87 




Immunoglobulin domain


2.7e-35 


118.7 


6 


36-121:162- 

249:292-

375:422- 

517:564- 

oj/:/w-/yj 


88 


MAP1_LC3


Microtubule associated
protein 1A/1B, light


9.4e-79


275.0


1 


118-221 


89 


WD40 


WD domain, G-beta repeat 


1.6e-12 


55.1 


4 


173-215:221- 

ZO0.X07- 

305:1103- 

1 


90 


FKBP 


FKBP-type peptidyl-prolyl
cis-trans isomeras 


L2e^59 


198.9 


1 


66-160 


92 


RPEL 


RPEL repeat 


6.Se-18 


15m 




576 


93 


transket_pyr


Transketolase, pyridine
binding domain


4.6e-65


229.6 


1 


568-773 


93 


E1_dehydrog


Dehydrogenase E1
component 


8.7e-23 


89.1 


1 


193-504 


95 


zf-C3HC4


Zinc finger, C3HC4 type
(RING finger)


8.7e-09


32.7


1 


595-635 


97 


ig 


Immunoglobulin domain


1.8e-20 


71.0 


3 


31-88:127- 
185:222-278 


98 


ig 


Immunoglobulin domain 


1.8e-20


71.0 


3 


24-81:120- 
178:215-271 


99 


Patched 


Patched family 


6.2e-06


-369.1 


1 


66-935 


102 


zf-C2H2 


Zinc finger, C2H2 type 


^3e-94 


326.9 


12 


209-231:237- 

259:265- 

287:293- 

315:321- 

343:349- 

371:377- 

399:405- 

427:433- 

455:461- 

483:489- 

511:594-616


102


KRAB 


KRAB box 


3.7e-37 


136.9 


1 


15-77 


103 


zf-C2H2 


Zinc finger, C2H2 type


Ue.55 


198.2 


9 


172-195:271- 

293:299- 

321:327- 

349:355- 

377:383- 

405:411- 

433:439- 

461:467-489 


103 


KRAB 


KRAB box 


3e^ 


167.1 


1 


8-70 


107 


zf-CCHC 


Zinc knuckle




67.8 


3 


913- 

930:1293- 

1310:1358- 

1375 


107 


NTP_transf_2


Nucleotidyltransferase
domain


4.4e-11


50.3 


1 


972-1065 


108 


zf-C2H2


Zinc finger, C2H2 type


1.6e-42 


154.7 


5 


283:289- 
311:317- 
339:345- 
367:373-395 


109 


myosin_head


Myosin head (motor domain) 


0 


1267.5 


1 


26-697 


109 


IQ 


IQ calmodulin-binding motif


1.2e-17


72.1 


4 


714-734:737- 

757:760- 

780:789-809 


110 


pkinase 


Protein kinase domain 


1^96 


334.5 


1 


20-271 


111 


WD40 


WD domain, G-beta repeat


1.8e-49


177.8 


8 


161-197:218- 

253:258- 

294:300- 

335:341- 

377:383- 

428:434- 

470:476-511 


112 


SNF2_N


SNF2 and others N-terminal
domain 


4^e-78 


272.9 




1-264 


112 


helicase_C


Helicase conserved C-
terminal domain


1.2e-24


95.4




326-410 


113 


DUF15 


Domain of unknown function 
DUF15 


0.00064 


-60.4 


1 


132-384 


114 


DSPc 


Dual specificity phosphatase,
catalytic 


0.0004 


-2.9 




141-295 


114 


Y_phosphatase


Protein-tyrosine phosphatase 


0.0037 


-26.9 




128-295 


115 


Ulp1_C


Ulp1 protease family, C-
terminal catalytic d


2.8e-52


187.1 




394-587 


117 


Rhodanese 


Rhodanese-like domain


1e-05


32.4 




160-260 


119 


ABC1


ABC1 family


1.7e^0 


147.9 




318-434 


122 


proteasome


Proteasome A-type and B- 
type 


7.4e-43 


155.8 




39-146 


124 


Ribosomal_L9


Ribosomal protein L9 


3.1e-05 


-3.4 




94-240 


125 


RIO1


RIO1/ZK632.3/MJ0444 
family 


7.8e-80 


278.6 




193-387 


128 


abhydrolase 


alpha/beta hydrolase fold


4.3e-20


80.1 


1 


121-364 


129 


TPR 


TPR Domain 


4.8e-27 


103.3 


7 


355-388:473-
506:507-
540:654- 
687:688- 
721:722- 
755:756-789 






HMG14 and HMG17 


1.9e-15 


64.7 


1 


2-73 


131 


bZIP 


bZIP transcription


8.3e-19 


71.7 


1 


288-352 


132 


rrm


RNA recognition motif. 


1.9e-31 


117.9 


3 


432-502:546- 
616:858-929 


133 


AMP-binding 


AMP-binding enzyme 


7.1e-117 


401.7 


1 


142-580 


138 


tubulin 


Tubulin/FtsZ family


2.1e-151 


516.4 


1 


1-223 


141 


laminin_EGF


Laminin EGF-like (Domains
III and V)


7.6e-12


52.8 


4 


252-297:300- 
348:1342- 
1391:1469- 
1530 


141 


Kelch 


Kelch motif 


1.6e-05


31.8


4 


654-702:760- 
918-929-990 


141 


integrin_B


Integrins, beta chain








44-59:100- 
117:1019- 
1028 


141 


EGF


EGF-like domain 


0.092 


19.3 


8 


167-203:207- 

235:297- 

331:496- 

533:538- 

569:1271- 

1308:1312- 

1338:1478-

1508 


142 


RUN 


RUN domain 


8e-44


159.0 


1 


31-163 


142


FYVE


FYVE zinc finger


2.3e-29


109.1


1 


529-593 


143 


zf-C2H2 


Zinc finger, C2H2 type


1.7e-33 


124.7 


5 


442-464:505- 
527:533- 
555:561- 
583:589-611 


143 


BTB 


BTB/POZ domain 


1.6e-22 


88.2 


1 


30-143 


144


mito_carr


Mitochondrial carrier protein


3.6e-61 


216.6 


3 


10-158:160- 
250:254-354 






Diacylglycerol kinase
catalytic domain 


0.00015 


26.0 


1 


157-303 


147 


Exonuclease 


Exonuclease 


L6e^l 


151.4 


1 


228-384 


147 


rrm 


RNA recognition motif. 


9.5e-08 


392 


2 


507-574:602- 
674 


151 


WH2 


WH2 motif 


63e-20 


79.6 


3 


1194- 
1214:1234- 
1254:1322- 
1342 


154 


DHDPS 


Dihydrodipicolinate 
synthetase family 


9.1e-21 


82.4 


1 


3-270 


156 


PseudoU_synth_1


tRNA pseudouridine synthase


1e-30


115.4 


1 


111-322 


157 


pkinase 


Protein kinase domain 


2.3e-59 


210.6 


1 


216-512 


158 


ubiquitin 


Ubiquitin family 


2.4e-05 


24.6 


1 


3-79


160




Initiation factor 2 subunit
family


1.7e-98


340.7 


1 


157^75 


161 


Beach 


Beige/BEACH domain 


1.1e-224


759.8 


1 


1470-1747 


161 


WD40 


WD domain, G-beta repeat 


2.9e-08 


40.9 


5 


1848- 

1882:1888- 
1928:1947- 
1983:2030- 
2064:2071-
2107 


164 


DnaJ 


DnaJ domain 


1.9e-16


68.1 




125-189 


165 


Anti_proliferat 


BTG1 family


7.4e-85


295.3 




11-164 


166 


sugar_tr


Sugar (and other) transporter 


1.2e-78 


274.7 




34-548 


167 


sugar_tr 


Sugar (and other) transporter 


7e-52 


185.8 






168 


zf-C2H2 


Zinc finger, C2H2 type


1.7e-93


324.0


1;5 


272:278- 

328:334- 
356:362-
384:390- 
412:418- 
440:446- 
468:474- 
496:502- 
524:530- 
552:558-580 


168 


KRAB


KRAB box


1.8e-35 


131.2 


1 


57-119 






Guanylate-binding protein,
N-terminal domain


le-191 


636.2 


1 


1-275 


1 no 
loy 




terminal domain 


6 60*162 


551.3 


1 


277-573 


170 


cyclin 


Cyclin, N-terminal domain


0 0(y22 


9.3 


1 


48-192 


171


TPR


TPR Domain


9.7e-43 


155.4 


6 


133-166:167- 

200:201- 

234:282- 

315:316- 

349:350-383 


L iO 




RhoGEF domain 


3.3e-40 


147.0 


1 


166-345 


17^ 

X 




PH domain 


6^e-14 


54.5 


1 


378-483 


173 


SH3 


SH3 domain 


1.1e-10


48.9 


1 


72-126 


174 


zf-C3HC4 


Zinc finger, C3HC4 type
(RING finger) 


0.00011 


19.4 


1 


18-55 


174 




Guanylate-binding protein, C- 
terminal domain


0.016 


12.1 


1 


86-114 


175 


Peptidase_M22


Glycoprotease family 


2.3e-73 


257.2 


1 


1-324 


177 


TBC 


TBC domain 


4.7e-08 


10.1 


1 


57-268 


178 


transmembrane4


Tetraspanin family 


1.6e-78 


259.2 


1 


16-261 


179 


CH 


Calponin homology (CH) 
domain 


1.2e-25 


98.6 


1 


24-133 


179 


calponin 


Calponin family repeat 


1.7e-14


51.8 


1 


174-199 


182 


AP_endonucleas1


AP endonuclease family 1 


2.6e-17 


59.4 


2 


1-36:50-135 


184 


Bacterial_PQQ


PQQ enzyme repeat


9.3e-05 


29.2 


2 


52-89:534- 
571


185


DEAD 


DEAD/DEAH box helicase


1.6&^0 


194.3 


1 


216-420 


185 




Helicase conserved C- 
terminal domain


5.9e-25 


96.3 


1 


454-540 


186 


zf-C2H2


Zinc finger, C2H2 type


3.2e-24 


93.9 


6 


106-128:134- 

156:162- 

184:195- 

218:477- 

499:505-529 


187 


sugar_tr


Sugar (and other) transporter 


0.0014 


-90.1 


1 


272-672 


188 


tRNA_int_endo


tRNA intron endonuclease,
catalytic C-t 


0.0025 


-7.7 


1 


73-159 


189 


WSC 


WSC domain 


1e-35


132.1 


1 


175-254 


189 


Sulfotransfer 


Sulfotransferase protein 


4e-34 


126.8 


1 


356-586 


191 


pkinase


Protein kinase domain 


5.1e-75 


262.6 


1 


148-421 


191 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


1.3e-05 


32.1 


1 


740-827 


193 


globin 


Globin 


1.9e-26 


96.6 


1 


3-78 


195 


WD40 


WD domain, G-beta repeat




59.6


4 


64-108:116- 

153:158- 

194:288-323 


197 


BRO1


BRO1-like domain


0.0042 


-29.4 


1 


9-161 


198 


F_actin_cap_B


F-actin capping protein, beta
subunit 




759.2 


1 


1-269 


199 


ank




1e-66


235.0 


8 


40-73:82- 
114:115- 
147:148- 
180:181- 
212:213- 
246:481- 
526:527-559 


203 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


4.2e^ 


37.0 


1 


211-293 




SAM 


SAM domain (Sterile alpha 
motif)


1.2e-11


52.1 


1 


5-70 




SAM 


SAM domain (Sterile alpha 
motif) 


1.2e-11


52.1 


1 


5-70 


206 


zf-UBR1


Putative zinc finger in N-
recognin


4.7e-25


96.7 


1 


978-1046 


207 


ABC_tran


ABC transporter


2.4e-112


386.6 


2 


467- 

647:1536- 
1717 


209 


zf-C2H2


Zinc finger, C2H2 type


0.00035 


27.3 


1 


200-225 


210 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


1.5e-19


78.4 


1 


385-454 


211 


IMP4 


Domain of unknown function 


2.2e-33 


124.3 


1 


144-297 


213 


zf-C2H2 


Zinc finger, C2H2 type


2.9e-08 


40.9 


3 


12-37:173- 
198:208-230 


214 


LysM 


LysM domain 


2.1e-11


51.3 


1 


73-116 


215 


ank 


Ank repeat 


1.1e-05


32.3 


2 


834-867:879- 
912 


215 


TIG 


IPT/TIG domain


0.009 


22.6 


1 


642-723 


217 


pyr_redox


Pyridine nucleotide-
disulphide oxidoreducta


1.7e-71


251.0


1


196-470










in 


Rieske 


Rieske [2Fe-2S] domain 


6.2e-20 


79.6 


1 


68-168 


218 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


8.5e-19 


75.9 


1 


642-728 


219 


pkinase 


Protein kinase domain 


8.1e-67 


235.4 


1 


26-204 


220 


dsrm


Double-stranded RNA 
binding motif 


0.095 


7.5 


1 


100-172 


221 


PHD 


PHD-finger


5.4e-05 


29.6 


1 


147-203 


222 


L27 


L27 domain 


6.5e-16 


66.3 


1 


13-68 


222 


SAM 


SAM domain (Sterile alpha
motif) 


7.2e-10 


46.2 


2 


1051- 

1117:1166- 
1230 


223 


TRM 


N2,N2-dimethylguanosine 
tRNA methyltransfera 


•73e-22 


86.1 


1 


227-693 


224 


LIM


LIM domain 


5.3e-06 


33.4 


2 


124-180:183- 
243 


225 


ig 


Immunoglobulin domain 


1.1e-07


29.8 


1 


55-144 


227 


F-box 


F-box domain 


1.3e-05


32.1 


1 


11-59 


229 


Glucosamine_iso


Glucosamine-6-phosphate 
isomerases/6- 


2.7e-158 


539.3 


1 


15-250 


231 


PTN_MK


PTN/MK heparin-binding
protein family 


3.6e-44 


160.2 


1 


5M48 


236 


ion_trans


Ion transport protein 


1.6e-22 


88.3 


1 


174-393 


238 


GNS1_SUR4


GNS1/SUR4 family 


5.2e-46 


166.3 


1 


10-265 


240 


ubiquitin 


Ubiquitin family 


2.7e-05 


24.4 


1 


10-89 


241 


PIP5K 


Phosphatidylinositol-4- 
phosphate 5-Kinase


1.5e-155 


530.2 


1 


124-420 


242 


cadherin 


Cadherin domain 


0 


1298.9 


19 


1-75:89- 

180:194- 

290:355- 

434:448- 

549:563- 

652:671- 

774:788- 

881:896- 

988:1002- 

1092:1106- 

1192:1206- 

1295:1309- 

1379:1393- 

1489:1503- 

1594:1608- 

1699:1713- 

1808:1814- 

1910:1922- 

2016 


244 


fn3 


Fibronectin type III domain 


l^e.31 


118.6 


4 


58-140:152- 

238:249- 

333:345-426 


245 


UQ_con


Ubiquitin-conjugating 
enzyme 


1.4e-16 


68.5 


1 


93-250 


246 


LRR 


Leucine Rich Repeat 


1.7e-14 


61.6 


6 


51-75:76- 






WO 02/081731



PCT/US02/01222



Table 4 



99:155-
178:181- 
203:204- 
226:227-251 


247 


lipocalin 


Lipocalin / cytosolic fatty-
acid binding pr


1^-28 


102.8 


1 


164-294 


248 


Ribosomal_S2


Ribosomal protein S2


2.9e-11


43.7 


1 


33-80 




lUUUllll 






554.2 


1 


1-277 




lUinuiii 




2.4e-212 


718.8 




1-351 


251 


ATP-synt_ab 


ATP synthase alpha/beta 

irtiimjf ^ iiuv^mji 


1^-75 


264.8 


1 


138-346 


251 


ATP-synt_ab_C


ATP synthase alpha/beta


2.7e-38


140.6 


1 


348^56 


251 


ATP-synt_ab_N 


ATP synthase alpha/beta 

Lou Ail jf 9 uvva'iMi 


5.4e-19 


76.5 


1 


67-135 






family, nucleot 


1 3e-70 


248.0 


1 


138-344 






ATP «vnthfl<f> nlnhfl/hAtfl 

family, beta-ba 




76^ 


1 


67-135 






(RING finger) 




43.2 


1 


39-79 






G-patch domain


1.3e-08 


42.1 


1 


410-456 


955 




Calponin homology (CH)
domain 


1.6e-ll 


51.7 


1 


24-134 


256 


RF-1 


Peptidyl-tRNA hydrolase
domain 


5.9e-66 


2323 


1 


225-338 


957 




Peptidyl-tRNA hydrolase
domain 


5.9e^ 


232.5 


1 


189-302 


95fi 






4.4e-18 


73.5 


1 


189-304 


950 


uuijrou 


ThiofiedrfeYin 

X IIIUtWUl/AAU 


2e-09 


35.7 


2 


119-165:662- 
695 


260 


thyroglobulin_1


Thyroglobulin type-1 repeat


3Je-34 


127.2 


2 


95-158:227-
292 


260 


kazal 


Kazal-type serine protease
inhibitor 


9.3e^7 


35.9 


1 


43-87 


262 


DnaJ 


DnaJ domain 


4.1e-15 


63.6 


1 


277-338 


263 


WD40 


WD domain, G-beta repeat


4e-21 


83.6 


5 


3-42:49- 
86:97- 
133:142- 
178:184-220 


265 


DUF6 


Integral membrane protein 
DUF6 


0.083 


9.1 


2 


81-316:338- 
470 


266 


Ribosomal_L31e


Ribosomal protein L31e 


1.7e-61 


217.7 


1 


15-109 


268 


F5_F8_type_C


F5/8 type C domain


2.4e-65


230.5 


1 


42-196 


268 


Zn_carbOpept


Zinc carboxypeptidase


3.5e-50 


180.1 


2 


224-341:400- 
600 


270 


BTB 


BTB/POZ domain 


7.7e-18


72.7 


1 


8-119 


270 


zf-C2H2


Zinc finger, C2H2 type


4.2e-13 


57.0 


4 


254-276:363- 

385:390- 

412:448-468 


271 


Glycos_transf_1


Glycosyl transferases group 1


0.027 


12.8 


1 


291-385 


272 


HEAT 


HEAT repeat 


2.2e-07 


38.0 


3 


237-275:276-
315:674-712


273 


HEAT 


HEAT repeat 


2.2e-07


38.0 


3 


237-275:276- 
315:640-678 


275 


SPRY 


SPRY domain 


2.6e-34 


127.4 


1 


390-515 


275 


zf-C3HC4


Zinc finger, C3HC4 type 
(RING finger) 


le-16 


58^ 


1 


29-69 


277 


BTB 


BTB/POZ domain 


6e-27


103.0 


1 


36-149 


277 


Kelch 


Kelch motif 


9.7e-21 


82.3 


4 


331-390:392- 

441:443- 

493:540-586 


278




Zinc finger, C2H2 type


4.1e-116 


399.2 


14 


193-215:221- 

243:249- 

271:277- 

299:305- 

327:333- 

355:361- 

383:389- 

411:417- 

439:445- 

467:473- 

495:501- 

523:529- 

551:557-579 


229 


SCAN 


SCAN domain 


2.4e-52


187.3 


1 


36-132 


229 


zf-C2H2 


Zinc finger, C2H2 type 


2.4e-51 


184.0 


7 


348-370:375- 

397:403- 

425:431- 

453:459- 

480:486- 

508:514-537 


231 


ZlD 


77P Zinc transporter 


6.6e-20 


79.6 


1 


M46 




NTP_transf_2


Nucleotidyltransferase
domain 


8.5e-13 


553 


1 


67-174 


286 


zf-C2H2 


Zinc finger, C2H2 type 


2.8e-93 


323.3 


12 


118-140:146- 

168:174- 

196:202- 

224:230- 

252:258- 

280:286- 

308:314- 

336:342- 

364:370- 

392:398- 

420:426-448 


286 


KRAB 


KRAB box 


3.6e-38 


140.2 


1 


8-70 


287 


zf-C2H2 


Zinc finger, C2H2 type


5.3e-124


425.4 


17 


183-205:211- 

233:239-

261:267- 

289:295- 

317:323- 

345:351- 

373:379-
401:407- 
429:435- 
457:463- 
485:491- 
513:519- 
541:547- 
569:575- 
597:603-
625:631-653 


289 


DiHfolate_red


Dihydrofolate reductase


7.4e-77


268.8 


1 


4-185 


291 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


7.4e-17 


69.4 


1 


5-84 


293 


PH 


PH domain 


1.4e-08 


35.5 


1 


44-147 


294 


adh_short


short chain dehydrogenase


3.9e-29


110.2 


1 


36-284 


297 


PKD 


PKD domain 




42.4 


2 


663-753:756- 
839 


297 


BNR 


BNR repeat 




34.1


5 


115-126:156- 
167:351- 
362:428- 
439:470-481 


300 


HMG_box


HMG (high mobility group)
box


5.4e-05


20.0 


1 


245-304 


301 


ig 


Immunoglobulin domain 


0.05 


11.6 


1 


629-688 


302 


zf-C3HC4 


Zinc finger, C3HC4 type
(RING finger) 


5e-12 


43.2 


1 


39-79 


303 


START 


START domain 


0.015 


4.1 


1 


1790-1994 


304 


integrase 


Integrase DNA binding
domain 


7.2e^ 


32.9 


1 


51-96 


305 


myosin_head


Myosin head (motor domain) 


7.6e-279 


939.7 


2 


11-668:689- 
733 


306 


zf-C2H2 


Zinc finger, C2H2 type


83e-54 


192.1 


7 


66-88:94- 

116:122- 

144:150- 

172:178- 

200:280- 

303:317-339 


307 


ig 


Immunoglobulin domain


0.00023 


19.1 


2 


35-104:136- 
194 


309 


ras 


Ras family


0.00079 


-93.3 


1 


38-176 


310 




Immunoglobulin domain


2.1e-06 


25.7 


1 


37-112 


311 


EF1BD


EF-1 guanine nucleotide
exchange domain


4.7e-56


199.6 


1 


139-225 


312 


BTB 


BTB/POZ domain


8.4e-25 


95.8 


1 


51-164 


313 


zf-C2H2 


Zinc finger, C2H2 type 




208.9 


9 


118-140:197- 

219:281-

303:309- 

331:337- 

359:365- 

387:393- 

415:421- 

443:449-471 


313 


KRAB 


KRAB box 


1.4e-17 


71.8 


1 


41-99 



314


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


0.045 


8.2 


1 


213-671 


315 


cNMP_binding


Cyclic nucleotide-binding 
domain 


4e-26 


100.2 


1 


387-475 


315 


ion_trans 


Ion transport protein 


3.8e-19 


77.0 


1 


69-290 


316 


Peptidase_S26 


Signal peptidase I 


2.8e-16 


56.3 


2 


38-98:117- 
139 


317 


zf-C2H2


Zinc finger, C2H2 type


4.3e-56 


199.8 


9 


156-178:184- 

206:212- 

234:240- 

262:268- 

290:296- 

318:324- 

346:352- 

374:378-400 


317 


KRAB 


KRAB box 


6.7e-16 


66.3 


1 


11-73 


319 


UPF0073 


Uncharacterised protein 
family 


1.8e-09 


27.9 


1 


33-276 


320 


EGF 


EGF-like domain 


4.7e-08 


40.2 


1 


26-59 


321 


lectin_c 


Lectin C-type domain 


8.6e-15 


62.6 


1 


268-374 


325 


MAM 


MAM domain 


1.3e-52 


188^ 


1 


338-503 


325 




Immunoglobulin domain 


1.9e-15 


54.8 


3 


41-101:138- 
202:346-420 


327 


MAM 


MAM domain 


5.3e-180 


611.4 


4 


26-169:170- 

329:342- 

498:509-666 


328 


Sema 


Sema domain 


1.5e-211 


716.2 


1 


56-491 


329 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-84 


294.3 


13 


170-192:198-220:226-248:254-276:282-304:310-332:338-360:366-388:394-416:422-444:450-472:478-500:506-528 


331 


PAP2 


PAP2 superfamily 


8e-22 


85.9 


1 


160-314 


332 


LRR 


Leucine Rich Repeat 


3.4e-36 


133.7 


11 


58-81:82-105:106-129:130-153:154-177:178-201:202-225:250-273:274-297:298-321:322-345 


332 




Immunoglobulin domain 


2.5e-08 


31.9 


1 


425-485 


332 


LRRNT 


Leucine rich repeat N- 


2.5e-05 


31.1 


1 


27-56 
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terminal domain 










332 


LRRCT 


Leucine rich repeat C- 
terminal domain 


0.0029 


24.3 




355-408 


333 


AdoHcyase 


S-adenosyl-L-homocysteine 
hydrolase 


1.5e-280 


945.4 




214-640 


334 


TBC 


TBC domain 


9.4e-38 


138.9 




89-302 


341 


WD40 


WD domain, G-beta repeat 


0.00094 


25.9 


1 


2-32:109-146 


342 


ABC1 


ABC1 family 


0.051 


-29.9 


1 


3-50 


344 


globin 


Globin 


3e-45 


162.2 


1 


M41 


345 


globin 


Globin 


7.5e-39 


139.9 




1-31:68-179 


347 


F-box 


F-box domain 


1.5e-07 


38.5 


1 


24-72 


348 


HLH 


Helix-loop-helix DNA- 
binding domain 


2e^8 


41.4 




83-137 


349 


KRAB 


KRAB box 


^76-39 


144.0 




4-66 


350 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.7e-19 


78.2 


1 


645-705 


350 


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil 


9.1e-15 


62.5 


1 


363-394 


350 


zf-UBP 


Zn-finger in ubiquitin- 
hydrolases and other 


0.00069 


18.9 


1 


236-306 


351 


NUDIX 


MutT-like domain 


8.2e-12 


52.7 




50-200 


352 


IBR 


IBR domain 


1.6e-12 


55.0 




101-166 


353 


IBR 


IBR domain 


1.6e-12 


55.0 


1 


66-131 


354 


SCP 


SCP-like extracellular protein 


1.4e-34 


128.3 


1 


56-208 


356 


mito_carr 


Mitochondrial carrier protein 


9.7e-78 


271.7 




10-125:127- 
220:232-321 


358 


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil 


5.1e-15 


63.3 




323-354 


358 


zf-UBP 


Zn-finger in ubiquitin- 
hydrolases and other 


0.00049 


19.4 




195-264 


360 


Phage_lysozyme 


Phage lysozyme 


0.0014 


23.4 




94-184 


362 


Ribosomal_S2 


Ribosomal protein S2 


3.3e-08 


32.9 




20-62 


364 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


5.3e-09 


33A 




291-329 


365 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.0096 


13.1 




109-148 


367 


TPR 


TPR Domain 


0.043 


20.4 




1-28 


370 


zf-C2H2 


Zinc finger, C2H2 type 


5.3e-109 


375.3 


14 


127-149:155- 

177:183- 

205:211- 

233:239- 

261:267- 

289:295- 

317:323- 

345:351- 

373:379- 

401:407- 

429:435- 

457:463- 

485:491-513 


370 


SCAN 


SCAN domain 


4^38 


140.0 


1 


27-122 


371 


arf 


ADP-ribosylation factor 


4.9e-39 


143.1 


1 


6-184 



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family 










371 


ras 


Ras family 


7.2e-06 


-70.1 


1 




372 


BNR 

- 


BNR repeat 


0.031 


20.9 


3 


171-182:244- 
Zjj.zyD-juo 


373 


zf-C2H2 

* 


Zinc finger, C2H2 type 


8.3e-25 


95.8 


5 


14Z-iDZ:l 11" 

ZZo.ZJ*f- 

258:264-288 


376 


rrm 


RNA recognition motif. 


0.00019 




1 


i IZ-IOJ 


377 


rrm 


RNA recognition motif. 


2.2e-19 


77.9 


1 




380 


vwc 


von Willebrand factor type C 
domain 


1.6e-31 


-f 1 o o 

118.2 


3 


L2r /o: /y- 
134:137-192 


381 


Ribosomal_L35Ae 


Ribosomal protein L35Ae 


0.00013 


7.0 


1 


1-79 


385 


ras 


Ras family 


3.9e-63 


223.2 


1 


35-229 


385 


arf 


ADP-ribosylation factor 
family 


1.7e-05 


-46.9 


1 


18-202 


388 


F-box 


F-box domain 


lJe-05 


31.9 


2 


23-70:99-146 


390 


SPRY 


SPRY domain 


6.2e-10 


46.4 


1 


101-239 


391 


tRNA_Me_trans 


tRNA methyl transferase 


1.9e-19 


50.9 


1 


5-185 


392 


zf-C2H2 


Zinc finger, C2H2 type 


4e-17 


70.3 


3 


175-197:203- 
225:231-253 


393 


SCAN 


SCAN domain 


3.1e-39 


143.8 


1 


389-484 


393 


SPRY 


SPRY domain 


1.8e-19 


78.1 


1 


148-273 


393 




Zinc finger, C2H2 type 


4e-09 


43.7 


2 


759-781:787- 
809 


393 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.0032 


14.7 


1 


11-52 


394 


Kelch 


Kelch motif 


4e-53 


189.9 


5 


329-375:377- 
431:433- 
479:481- 
525:527-572 


394 


BTB 


BTB/POZ domain 


6.1e-26 


99.6 


1 


30-144 


395 


C2 


C2 domain 


2^80 


280.4 


2 


159-251:296- 
384 


396 


ank 


Ank repeat 


5.6e-33 


123.0 


4 


47-79:80-112:142-174:175-207 


396 


PH 


PH domain 


8.9e-05 


22.0 


1 


236-334 


397 


ank 


Ank repeat 




101.4 


4 


17-49:50-82:83-115:116-148 


398 


Nucleoplasmin 


Nucleoplasmin 


3.6e-29 


110.4 


1 


13-209 


400 


DAGKa 


Diacylglycerol kinase 
accessory domain 


1.9e-124 


426.8 


1 


598-778 


400 


DAGKc 


Diacylglycerol kinase 
catalytic domain 


7.1e-67 


235.6 


1 


454-578 


400 


DAG_PE-bind 


Phorbol esters/diacylglycerol 
binding dom 


2.9e-23 


90.7 


2 


261-310:326- 
374 


400 


efhand 


EFhand 


2.4e-12 


54.4 


2 


169-197:214- 
242 


403 


PDZ 


PDZ domain (Also known as 


7.7e^6 


165.7 


3 


86-166:210- 



DHR or GLGF) 








291:821-907 


404 


zf-C2H2 


Zinc finger, C2H2 type 


2.6e-48 


173.9 


7 


172-194:200- 

222:228- 

250:256- 

278:284- 

306:312- 

331:340-362 


405 


K_tetra 


K+ channel tetramerisation 
domain 


2.6e-23 


90.9 


1 


51-146 


406 


SNF 


Sodium:neurotransmitter 
symporter family 


0 


1268.7 


1 


60-657 


407 


ig 


Immunoglobulin domain 


1.1e-06 


26.5 


1 


53-120 


408 


DnaJ 


DnaJ domain 


2.3e-27 


104.3 


1 


4-68 




DnaJ_C 


DnaJ C terminal region 


3.1e-08 


38.1 


1 


192-314 


409 


mito_carr 


Mitochondrial carrier protein 


1.4e-57 


204.7 


3 


5-100:102- 
201:205-302 


410 


zf-C2H2 


Zinc finger, C2H2 type 


5.2e-97 


335.7 


12 


141-163:169- 

191:197- 

219:225- 

247:253- 

275:281- 

303:309- 

331:337- 

359:365- 

387:393- 

415:421- 

443:449-473 


411 


S_100 


S-100/ICaBP type calcium 
binding domain 


9.7e-13 


55.8 


1 


5-48 


411 


efhand 


EFhand 


0.0012 


25.6 


1 


54-82 


413 


fn3 


Fibronectin type III domain 


8.6e-14 


59.3 


2 


22-107:119- 
196 


413 


PHD 


PHD-finger 


9.6e-05 


27.2 


1 


285-341 


414 




Zinc finger, C2H2 type 


2.3e-27 


104.4 


6 


42-64:117- 

139:145- 

167:173- 

196:534- 

556:573-595 


415 


SPRY 


SPRY domain 


3.9e-18 


73.7 


1 


347-467 


415 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


4.4e-14 


49.9 


1 


16-56 


415 


zf-B_box 


B-box zinc finger 


9e^7 


35.9 


1 


92-133 


416 


pkinase 


Protein kinase domain 


1.2e-54 


195.0 


1 


97-317 


417 


trypsin 


Trypsin 


4.6e-38 


122.5 


1 


41-234 


418 


Glypican 


Glypican 


5.7e-131 


448.5 


1 


3-244 


419 


Keratin_B2 


Keratin, high sulfur B2 
protein 


0.0013 


-23.4 


1 


37-159 


420 


Dynein_heavy 


Dynein heavy chain 


0 


1432.3 


1 


309-1019 


421 


zf-C2H2 


Zinc finger, C2H2 type 


0.00039 


27.2 


3 


75-99:203- 
227:266-290 


422 




Immunoglobulin domain 


0.00074 


17.5 


1 


34-107 


423 


fn3 


Fibronectin type III domain 


6e^8 


39.8 


1 


443-531 
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Table 4 



SEQ 
ID 


Pfam Model 


Description 


E-value 


Score 


No. of 
Pfam 


the Domain 


424 


Keratin_B2 


Keratin, high sulfur B2 
protein 


U.UUZo 






5-150:152- 
251 


425 


pkinase 


Protein kinase domain 


2.3e-55 


197.3 


1 


69-390 


426 




Immunoglobulin domain 


4.1e-09 


34.4 


1 


35-112 


427 


Galactosyl_T 


Galactosyltransferase 


2.6e-35 


130.8 


1 


158-349 


428 


proteasome 


Proteasome A-type and B- 


5^e-28 


106.4 


1 


96-238 


429 


7tm_1 


7 transmembrane receptor 
(rhodopsin family) 


3.4e-38 


123.5 


1 


41-290 


430 


BTB 


BTB/POZ domain 


8.1e-23 


89.2 


1 


58-173 


430 




Zinc finger, C2H2 type 


4.3e-07 


37.0 


2 


472-494:500- 
523 


433 


P450 


Cytochrome P450 


6.4e-175 


594.5 


1 


33-493 


434 


sugar tr 


Sugar (and other) transporter 


2.6e-64 


227.1 


1 


10-512 


435 


zf-C2H2 


Zinc finger, C2H2 type 






Q 


287-309:315- 

337:546- 

568:574- 

596:606- 

628:844- 

866:872- 

894:980- 

1002:1008- 

1030 


436 


7tm_1 


7 transmembrane receptor 
(rhodopsin family) 


2.2e-40 


130.4 


2 


82-221:229- 
284 


437 


FGF 


Fibroblast growth factor 


4.6e-14 


51.6 


1 


48-129 


438 


Osteopontin 


Osteopontin 


3.7e-181 


615.2 


1 


1-294 
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PDB annotation 


TRANSCRIPTION 
REGULATION PROTO- 
ONCOGENE. NUCLEAR 
BODIES (PODS), LEUKEMIA, 
2 TRANSCRIPTION 
REGULATION i( 


TRANSFERASE HRS; HRS, N|j 
VHS. FYVE. ZINC FINGER, 
SUPERHELIX 


METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MATl; RING FINGER 
(C3HC4) 




COMPLEX 

j (ISOMERASE/DIPEPTIDE) 
PINl; PEPTIDYL-PROLYL 
CIS-TRANS ISOMERASE, 
ROTAMASE, 2 COMPLEX 

1 (ISOMERASE/DIPEPTIDE) 

! CONECT 




COMPLEX V 
(APOPTOSIS/PEPTIDE) fi 
APOPTOSIS, ALTERNATIVE ^ 
SPLICING, COMPLEX 
(APOPTOSIS/PEPTIDE) ^ 


APOPTOSIS HELICAL T. 
PROTEIN g 


APOPTOSIS APOPTOSIS 
REGULATOR BCL-X; 
APOPTOSIS, PROGRAMMED 
CELL DEATH, BCL-2 
FAMILY 


ST 


1 STRUCTURAL PROTEIN P4 


Compound 


TRANSCRIPTION FACTOR 
PML; CHAIN: NULL; 




HEPATOCYTE GROWTH 
FACTOR-REGULATED 
TYROSINE CHAIN: A; 


CDK-ACTIVATING 
KINASE ASSEMBLY 
FACTOR MAT1; CHAIN: A; 




Oh ^ ^ 

Q to 12 
S P C 


DIPEPTIDE; CHAIN: B; 




BCL-XL;CHAIN:A; BAK 


PEPTIDE; CHAIN: B; 


APOPTOSIS REGULATOR 
BAX. MEMBRANE 
ISOFORM ALPHA; CHAIN: 
A: 


t 
ft 


i 




1 ALPHA SPECTRIN: CHAIN: 


SEQFOLD 
score 






















1 81.85 


PMF 
score 


0.01 


0.76 


i 0.62 




0.23 




o 
d 


0.10 


0.03 






Verify 
score 


00 

o 

1 


-0.23 


-0.44 




-0.45 






-0.07 


-0.28 


-0.03 






Psi 
Blast 


§ 

<s 


0.0061 


1.3e-06 




s 

6 

00 




00 

cn 




1 

cn 




1 le-21 




8 


pi 






VO 

00 




<s 

VO 
CO 




VO 
CO 




r* 


START 
AA • 


o 




o 

f— 1 










i 






OV 


CHAIN 
ID 




< 


< 




< 




< 








< 


gs 


Ibor 


j Idvp 






Ipin 




Ibxl 


9UI 


Itnaz 




\ Icun 


SEQID 
NO: 










VO 








1-1 




OJ 

«-4 
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PDB annotation 






CHAPERONE HOP, TPR- 
DOMAIN, PEPTIDE- 
COMPLEX, HELICAL 
REPEAT, HSP90, 2 PROTEIN 
BINDING 


SIGNALING PROTEIN 
PEROXISMORE RECEPTOR 
1. PTSl-BP, PEROXIN-5, PTSl 
PROTEIN-PEPTIDE 
COMPLEX, 

TETRATRICOPEPTIDE 
REPEAT. TPR, 2 HELICAL 
REPEAT 




LYASE EPIMERASE. 
DEHYDRATASE. 
DEHYDROGENASE. LYASE 


ISOMERASE EPIMERASE; 
UDP-GALACTOSE, 
EPIMERASE, ISOMERASE 




TRANSFERASE MURG; O 
ROSSMANN FOLD 




AMINOACYL-TRNA 'Zl 
SYNTHETASE METRS; g 
AMINOACYL-TRNA UJ 
SYNTHETASE, ROSSMANN \ 

FOLD e 


< 

u K 

s 
pg 

5 CO 


2" 




Compound 


RIBOSOMAL PROTEIN L6; 
CHAIN: 1; 




TPR2A-D0MAIN OF HOP; 
CHAIN: A; HSP90-PEPTIDE 
MEEVD; CHAIN: B; 


PEROXISOMAL . 
TARGETING SIGNAL 1 
RECEPTOR; CHAIN: A. B; 
PTSl-CONTAINING 
PEPTIDE; CHAIN; C, D; 




DTDP-GLUCOSE 4,6- 
DEHYDRATASE; CHAIN: 
A,B; 


UDP-GALACTOSE-4- 
EPIMERASE; CHAIN: 
NULL; 




UDP-N- 1 
ACETYLGLUCOSAMINE- 
N-ACETYLMURAMYL- 
CHAIN:A.B: 




METHIONYL-TRNA 
SYNTHETASE; CHAIN: 
NULL; 


GLUTAMYL-TRNA 
SYNTHETASE; IGLN 4 
CHAIN: NULL IGLN 5 




i 

1 
1 


SEQFOLD 
score 












55.95 


72.32 








103.94 


74.49 






PMF 
score 






*-« 
d 


0.04 










0,59 










1 0.95 


Verify 
score 






0.04 


S 
9 










0.19 










10.18 


Psi 
Blast 






f— • 
1 

«> 


5.1e.l7 




le-72 


1 




5.1e-20 




1.7e-57 


o 

1 

oo 




1 5.1e-ll 










o 




NO 

cn 


>o 

fO 




o 

rr 




CO 

>o 
m 


r*- 

CO 

m 






START 
AA 












1— • 


1^ 




oo 






CO 




CO 
CO 


CHAIN 
ID 






< 


•< 




< 






-< 














Is 






lelr 


Ifch 




Ibxk 


ludb 




1— i 




oo 
8a 

— H 






Ichc 


SEQID 
NO: 






>o 
1— • 






r- 










o 


o 

CO 




cs 

CO 
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PDB annotation 

1 




LIGASE CBL, UBCH7, ZAP- 
70, E2, UBIQUITIN, E3, 
PHOSPHORYLATION, 2 
TYROSINE KINASE, 
UBIQUITINATION, PROTEIN 
DEGRADATION, 


LIGASE CBL, UBCH7, ZAP- 
70, E2, UBIQUITIN, E3, 
PHOSPHORYLATION, 2 
TYROSINE KINASE, 
UBIQUITINATION, PROTEIN 
DEGRADATION, 


METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MAT1; RING FINGER 
(C3HC4) 


METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MAT1; RING FINGER 
(C3HC4) 


DNA-BINDING PROTEIN J 
V(D)J RECOMBINATION A 
ACTIVATING PROTEIN 1; jl 
RAG1.V(D)J ^ 
RECOMBINATION, U 
ANTIBODY, MAD, RING Q 
FINGER, 2 ZINC BINUCLEA]p 
CLUSTER, ZINC FINGER, V 
DNA-BINDING PROTEIN 


DNA-BINDING PROTEIN H 
V(D)J RECOMBINATION ^ 
ACTIVATING PROTEIN 1 ; it 
RAGI, V(D)J fJL 
RECOMBINATION, ^ 


Compound 


VIRUS-1 (C3HC4, OR RING 
DOMAIN) ICHC 3 (NMR. 1 
STRUCTURE) ICHC 4 


SIGNAL TRANSDUCTION 
PROTEIN CBL; CHAIN: A; 
ZAP-70 PEPTIDE; CHAIN: 
B; UBIQUITIN- 
CONJUGATING ENZYME 
E12-18 KDA UBCH7; 
CHAIN: C; 


SIGNAL TRANSDUCTION 
PROTEIN CBL; CHAIN: A; 
ZAP-70 PEPTIDE; CHAIN: 
B; UBIQUITIN- 
CONJUGATING ENZYME 
E12-18 KDA UBCH7; 
CHAIN: C; 


il 


FACTOR MATI; CHAIN: A; 


CDK-ACTIVATING 
KINASE ASSEMBLY 
FACTOR MATI; CHAIN: A; 


t 




RAGI; CHAIN: NULL; 


SEQFOLD 
1 score 














60.57 




score 




0.33 


0.93 


0.25 


0.51 


1.00 






score 




0.12 


0.73 


0.05 


-0.26 


0.25 






1 Blast 




2.6e-13 


o 


m 


ro 


3.4e-20 


3.4e-20 






OQ 


r- 


00 


<N 

00 


^ 


^ 
«— 4 


START 
AA 




CO 


m 
<** 




CO 
CO 




OS 


1 CHAIN 1 


a 




<' 


< 


< 


< 






la 




Ifbv 


\ Ifbv 






Imtd 


Irmd 


B 


NO: 






cn 


m 


CO 


CO 


CO 
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PDB annotation 


ANTIBODY, MAD, RING 
FINGER, 2 ZINC BINUCLEAR 
CLUSTER, ZINC FINGER, 
DNA-BINDING PROTEIN 




STRUCTURAL PROTEIN 
TWO REPEATS OF 
SPECTRIN, ALPHA HELICAL 
LINKER REGION, 2 
TANDEM 3-HELIX COILED- 
COILS, STRUCTURAL 
PROTEIN 


ENDOCYTOSIS/EXOCYTOSI 
S NSECl; PROTELN-PROTEIN 
COMPLEX, MULTI-SUBUNTT 


ENDOCYTOSIS/EXOCYTOSI 
S SYNAPTOTAGMIN 
ASSOC3ATHD35KDA 
PROTEIN, P35 A, THREE 
HELIX BUNDLE 


PROTBIN TRANSPORT 
HRF ra-TURN-HELK TPR- 
LIKE REPEAT, PROTOIN 
TRANSPORT . f1 


c 


SIGNALING PROTEIN GTP- ^ 
BINDING PROTEINS. ^, 
PROTEIN-PROIEIN j| 
COMPLEX, EFFECTORS 

o 


GTP-BINDING PROTEIN fH 
GTP-BINDING PROTECT, v 
SMAULG PROTEIN, RAP2, « 
GDP.RAS H 


COMPLEX (GTP- ^ 
BINDING/EFFECTOR) RAS- fU 
RELATED PROTEIN RABSAfW 
COMPLEX (GTP. fill 


Compound 






ALPHA SPECTRIN; CHAIN: 
A, B.C; 


SYNTAXIN BINDING 
PROTEIN 1; CHAIN: A; 
SYNTAXIN 1 A; CHAIN: B; 


SYNTAXIN-1 A; CHAIN: A, 
B,C; 


VESICULAR TRANSPORT 
PROTEIN SEC17; CHAIN: 
A; 




RAS-RELATED PROTEIN 
RAP-IA; CHAIN: A; 
PROTO-ONKOGENE 
SERINE/THREONINE 
PROTEIN KINASE CHAIN: 
B; 


RAP2 A; CHAIN: NULL; 


RAB-3A; CHAIN: A; 
RABPHILIN-3A; CHAIN: B; 


SEQFOLD 
score 
















123.90 


117.80 


152.38 


PMF 
score 






0.04 


0.27 


-0.09 


0.04 










Verify 
score 






0.36 


-0.03 


0.09 


0.19 










Psi 
Blast 






A 
o 
ei 


00 

T-H 

i> 

so 


vo 


S 




S 
& 
•» 


00 

vo* 


oo 

1 

NO 








Pi 


vo 

VO 




JS 






1 

00 




START 
AA 






CM 


r- 








<s 


cn 


o 
cn 


CHAIN 
ID 






< 


0Q 


< 


< 




< 




< 








Icun 


Idnl 


8 


Iqqe 




Icly 


Ikao 


Izbd 


SEQID 
NO: 






m 
m 


m 
m 


ro 
m 


m 
m 
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PDB annotation 


UDA; LECTIN. HEVEIN 
DOMAIN, UDA, 
SUPERANTIGEN 


SUGAR BINDING PROTEIN 
UDA; LECTIN. HEVEIN 
DOMAIN, UDA, 

SUPERANTIGEN J| 


SUGAR BINDING PROTEIN ^5 ' 
UDA; LECTIN, HBVEIN 
DOMAIN, UDA. 
SUPERANTIGEN, 
SACCHARIDE BINDING 


SUGAR BINDING PROTEIN 
UDA; LECTIN, HBVEIN 
DOMAIN, UDA, 
SUPERANTIGEN. 
SACCHARIDE BINDING 


SUGAR BINDING PROTEIN 
UDA; LECTIN, HEVEIN 
DOMAIN, UDA, 
SUPERANTIGEN, 
SACCHARIDE BINDING 


GLYCOPROTEIN 
GLYCOPROTEIN *Q 






Compound 


VI/AGGLUTININ 
ISOLECTIN V; CHAIN: A; 


AGGLUTININ ISOLECTIN 
VI/AGGLUrmiN 
ISOLECTIN V; CHAIN: A; 


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


LAMININ; CHAIN: NULL; 


AGGREGATION 

iNHmrroR, gp 

ANTAGONIST KISTRIN 
(NMR. 8 STRUCTURES) 
IKST 3 


FACTOR KA; CHAIN: C, 
L.; D-PHE-PRaARG; 
CHAIN: I; 


q 




















SEQFOI 
score 












70.40 






m . 


PMF 
score 




0.41 


0.62 


0.11 


0.03 




0.00 




^2 

O 




o 


t-H 


-0.71 


-0.05 




-0.45 




> 




9 


«^ 






PsI 
Blast 




OO 

9 

CO 
1— • 


1.3e-10 


6 


o 


vd 










s 




? 




t-H 


»n 




START 
AA 




O 


cn 

•-4 




m 

00 




On 




CHAIN 
ID 




< 




< 


< 














leis 


•a 

8 


1 

»— 1 




Iklo 


Ikst 


p. 


SEQID 
NO: 






















5 


5 


f— • 




5 


5- 




5 
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PDB annotation 


COMPLEMENT INHIBITOR 
SP35, VCP, VACCINIA VIRUS 
SP35; COMPLEMENT 
INHIBITOR, COMPLEMENT 
MODULE, SCR, SUSHI 
DOMAIN, 2 MODULE PAIR 


COMPLEMENT INHIBITOR 
SP35, VCP, VACCINIA VIRUS 
SP35; COMPLEMENT 
INHIBITOR, COMPLEMENT 
MODULE, SCR, SUSHI 
DOMAIN, 2 MODULE PAIR 








PC 




PROTEIN BINDING ACTIN-W? 
BINDING PROTEIN, 0 
ALLERGEN fij 


CONTRACTILE PROTEIN \ 
ACIDIC PROFILIN ISOFORI^ 
ACTIN-BINDING PROTEIN.^ 
POLY-L- 2 PROLINE ^ 
BINDING PROTEIN, fy 
CONTRACTILE PROTEIN fll 


( STRUCTURAL PROTEIN f1 f I 


Compound 


VACCINIA VIRUS 
COMPLEMENT CONTROL 
PROTEIN; CHAIN: NULL; 


VACCINIA VIRUS 
COMPLEMENT CONTROL 
PROTEIN; CHAIN: NULL; 


LECTIN (AGGLUTININ) 
WHEAT GERM 
AGGLUTININ (ISOLECTIN 
2) 9WGA 3 


LECTIN (AGGLUTININ) 
WHEAT GERM 
AGGLUTININ (ISOLECTIN 
2) 9WGA 3 




PROTEIN BINDING 
PROFILIN I lACF 3 
PROTEIN BINDING, 
PROFILIN 1ACF4A 


PROTEIN BINDING | 
PROFILIN I lACF 3 
PROTEIN BINDING, 
PROFILIN 1ACF4A I 


PROFILIN; CHAIN: NULL; 


PROFILIN H; CHAIN: A, B. 
C.D; 


1 PROFILIN H; CHAIN: A, B; 


SEQFOLD 
score 














69.76 








PMF 
score 


-0.18 


0.01 


-0.17 


o\ 
d 




1.00 




1.00 


0.94 


1 1.00 


Verify 
score 


0.06 


9 


0.09 


0.09 




1.04 




0.91 


0.70 


1 0.75 


PsI 
Blast 




3.9e-13 


1.7C-14 


eg 

'S 




oo 
cn 

a> 
m 

00 


00 

cn 

00 ' 


(£> 


1 

cn 


L2e-39 




1450 


1623 


1252 


1284 




00 
CM 


05 

CS 


00 




oo 
(S 
»— 1 


START 
AA 


1330 


1524 


1072 


1123 






cn 


00 
»— « 


NO 


1— • 


CHAIN 
ID 


































< 




< 












< 


<: 




Iwc 


Ivvc 


OS 

r 

On 


9wga 




lacf 


lacf 1 


Icqa 


f-H 




SEQID 
NO: 


5 


5 










m 


in 


53 
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PDB annotation 


SEVEN-STRANUJiU 
INCOMPLETE 
ANTIPARALLEL UP- AND- 


DOWN BETA 2 BARREL, 
ACTIN-BINDING PROTEIN, * 
POLY-L-PROLINE BINDING 
3 PROTEIN. PIP2 BINDING d 
PROTEIN 


PROTEIN BINDUNU 
ACETYLATION. ACTIN- j 
BINDING PROTEIN. 
MULTIGENE FAMILY 


95 
§ pa 

li 




ACTIN-BINDING PKUTtSlN 
ACTIN-BINDING PROTEIN. 
PROFILIN, CYTOSKELETON 


.12; 

0 o a 

a< p:; Q 

Duj fit CO 

0 o o 
?2; ^ 

3Q R 
U U 

< < e 


CO 
000 

9 S 


TARGETING PROTEIN PIClfJJ 
GMPl. UBLl. SENTRIN; J| 
SUMO-1. POST- ]^ 
TRANSLATIONAL PROTEIl^ 
MODinCATION.2 Q 
UBIQUmN-LlKE PROTEINSpJ 
TARGETING PROTEIN \_ 


5 6 0 ( 

CO ID -< A, 


si 

li 

S Q 


Compound 




PROFILIN; CHAIN: NULL; 


PROFEJN; CHAIN: A, B; 


ACTIN BINDING PROTEIN 
PR0HLIN1PNE3 


PROFILIN; CHAIN: A. B; 


PROFILIN; CHAIN: A,B; 


PROFILIN I; CHAIN: NULL; | 


SUMO-1; CHAIN: NULL; 


1 

0 

^< 

it 
li 


1D8 UBIQUrriN; CHAIN: A; 


i 


score 1 












70.17 












score 




0.94 


0.99 


0.98 


1.00 




1.00 


0.00 


1.00 


1.00 


Verify 
score 




0.15 


0.66 


0.41 


0.54 

1 




0.72 


0.28 


1.02 


0.86 


1 


Blast 




f-H 
f-H 




o 

o 

OJ 


3.4e-38 


oo 
m 
1 

^. 

en 


< 


VO 

% 

00 
NO 


1.2e-23 


le-31 






B 




»r> 

»-H 


m 


05 
(N 






8 


S 


START 
AA 






NO 


VO 








ON 






CHAIN 
ID 






< 




< 


< 






< 


< 


PDB 


• 


Ifil 


s 


Ipne 


lypr 


lypr 


3nul 


•Q 


IbtO 




e. 

U3 


1 












9 


!? 
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PDB annotation 




SERINE PROTEASE 
INHIBITOR ALPHA-1- 
PROTEINASE INHIBITOR, 
ALPHA-1-ANTIPROTEINASE; 
SERINE PROTEASE 
INHIBITOR, SERPIN, 
GLYCOPROTEIN, SIGNAL, 2 
POLYMORPHISM, 
EMPHYSEMA, DISEASE 
MUTATION, ACUTE PHASE 


SERINE PROTEASE 
INHIBITOR ALPHA-1- 
PROTEINASE INHIBITOR, 
ALPHA-1-ANTIPROTEINASE; 
SERINE PROTEASE 
INHIBITOR, SERPIN, 
GLYCOPROTEIN, SIGNAL, 2 
POLYMORPHISM, 
EMPHYSEMA, DISEASE 
MUTATION, ACUTE PHASE 


SERPIN AACT SERPIN, 
SERINE PROTEINASE 


INHIBITOR, PARTIAL LOOP % 
2 INSERTION, LOOP-SHHBT n 
POLYMERIZATION. M 
EMPHYSEMA, DISEASE 3 * " - 
MtJTATION, ACUTE PHASE M 
PROTEIN, S 
CONFORMATIONAL 
DISEASE C 


lE^/ UK 




COMPLEX (TRANSCRIPTION^ 
FACTOR/DNA) COMlPLEX fl 
(TRANSCRIPTION R. 


Compound 




ALPHA-l-ANTITRYPSIN; 
CHAIN: A; 


ALPHA-l-ANTITRYPSIN; 

CHAIN: A: 




ALPHA-1- 

ANTICHYMOTRYPSIN; 


< 


PROTEINASE INHIBITOR 
ALPHAl 

ANTICHYMOTRYPSIN 
2ACH3 




i T PROTEIN; CHAIN: A, B; 
DNA: CHAIN! C. D: 




SEQFOLD 
score 




300.18 










178.20 


PL 


score 






1.00 


1.00 


0.06 






Verify 
score 






0.76 


0.72 


CO 

9 






Psi 
Blast 




o 


o 


o 


1 

CO 




6 

CO 
NO 


1 END i 


AA 




tn 


? 


3 


CO 

9 






START 
AA 






1-1 

so 




8 




oo 

CO 


1 CHAIN 1 


a 




< 


< 


< 






< 


ga 




iqlp 


Iqlp 


Iqmn 


2ach 




Ixbr 


1 SEOID 


NO: 










9 




o 
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PDB annotation 


FACTOR/DNA). 
TRANSCRIPTION FACTOR. 2 
DNA-BINDING PROTEIN 


COMPLEX (TRANSCRIPTION 
FACTOR/DNA) COMPLEX 
(TRANSCRIPTION 
FACTOR/DNA). IB 
TRANSCRIPTION FACTOR, " 
DNA-BINDING PROTEIN 




CELL ADHESION NEURAL 
CELL ADHESION 


CELL ADHESION NEURAL 
CELL ADHESION 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF. 
FGFR, IMMUNOGLOBULIN- 
LIKH. SIGNAL 
TRANSDUCnON. 2 
DIMEREZATION, GROWTH 
FACTOR/GROWTH FACTOR 


1 


IMMUNE SYSTEM CD32; 
RECEPTOR. FC. CD32. V 
IMMUNE SYSTEM H 


:l 


COMPLEX (NUCLEOCAPSID.^ 
PROTEIN/RNA) M 
NUCLEOCAPSID PROTEIN 
COMPLEX (NUCLEOCAPSng 
PROTEIN/RNA). 2 STEM- P 
LOOPRNA fU 


COMPLEX (NUCLEOCAPSID-%, 
PROTEIN/RNA) Q 
NUCLEOCAPSID PROTEIN^ 
COMPLEX (NUCLEOCAPSnC 
PROTEIN/RNA), 2 STEM- rU 
LOOPRNA nj 


s 


Compound 




T PROTEIN: CHAIN: A. B; 
DNA; CHAIN: CD; 




AX0NIN-1;CHAIN:A; 


AX0NIN-1;CHAIN:A; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A. B; 
FIBROBLAST GROWTH 
FACTOR RECHFIORI; 
CHAIN: C, D; 


FC GAMMA RIIB; CHAIN: 
A; 




NUCLEOCAPSID PROTEIN; 
CHAIN: A; SL3 STEM-. 
LOOP RNA; CHAIN: B; 


NUCLEOCAPSID PROTEIN; 
CHAIN: A; SL3 STEM- 
LOOP RNA; CHAIN: B; 


NUCLEOCAPSID PROTEB^ 


SEQFOLD 
score 
























PMF 
score 




1.00 




o 


en 


0.28 


OO 




0.27 




1-4 






d 


d 


d 






d 


Verify 
score 




0.68 




-0.71 


-0.31 


d 


-0.30 




0.07 


0,05 


1 0,28 


Psl 
Blast 




! ■ 

00 




S 


! . 


! 

NO 


5.2e^5 




3.4e-17 


o 


00 

vd 


is 




o 

CO 






o 


r-l 


00 
ON 




OO 
I-- 


1-4 

OO 


00 

r* 


START 
AA 














»-4 

CM 




B 






CHAIN 
ID 




< 






< 


<: 


Q 




< 




< 




< 




se 




Ixbr 




8 


NO 
»— » 


Icvs 


2fcb 




lalt 


lalt 


laaf 


SEQID 
NO: 




o 






in 




1—4 




«n 


CO 

m 


CO 
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PDB annotation 




COMPLEX (MHCyVIRAL 


yHVi lDt/Ht^iJiih^l UK) HJLA 
A2 HEAVY CHAIN; 


COMPLEX (MHCAHRAL ^ 




1 
1 


IMMUNE SYSTEM 
IMMUNOGLOBULIN, 


IMMUNORECEPTOR, 
IMMUNE SYSTEM 


IMMUNE SYSTEM MHC I- 
AK; MHC I-AK; T-CELL 
RECEPTOR. MHC CLASS II. 
DIO. I-AK 


S"*^ 

1 


IMMUNE SYSTEM FAB-IBP 7 
COMPLEX CRYSTAL 
STRUCTURE 2.7A 
RESOLUTION BINDING 2 i". 
OUTSIDE THE ANTIGEN f- 
COMBINING SITE fl 
SUPERANTIGENFABVH3 3 X 


4 
>-• 


COMPLEX (HIV ENVELOPE 
PR0TEIN/CD4/FAB) M 
COMPLEX (HTV ENVELOPE f1| 


PROTEIN/CD4/FAB), HTV-l fU 
EXTERIOR 2 ENVELOPE 


Componnd 


CHAM: D;T CELL 
RECEPTOR BETA; CHAIN: 
E; 


HLA-A 0201; CHAIN: A; 
BETA-2 MICROGLOBULIN; 
CHAIN: B; TAX PEPTIDE; 
CHAIN; r;TCRI,L 


S 1 S w 


ALPHA-BETA T CELL 
RECEPTOR (TCR) (DIO); 
CHAIN: A; 


T-CELL RECEPTOR DIO 
(AUPHA CHAIN); CHAIN: 




IGMRF2A2;CHAIN:A,C 
E; IGM RF 2A2; CHAIN: B. 
D. F; IMMUNOGLOBULIN 
G BINDING PROTEIN A; 
CHAIN: a H; 


ENVELOPE PROTEIN 
GP120; CHAIN: G; CD4; 
CHAIN: C; ANTIBODY 17B; 


CHAIN: L,H; 


SEQFOLD 
score 




70.90 








50.38 


Oh 


score 1 






0.88 


1.00 


0.52 i 




Verify 
score 






0.13 


*-< 

o 


0.12 






Blast 1 




m 

i> . 


S 


m 
c> 
<^ 


o\ 
m 

o 

00 


•2 

00 
VD 












»n 




START 
AA 








r-4 


OS 




t 


a 




a 


< 


< 






ft. ^ 




f-H 


Ibwm 




Idee 




SEQ ID 
NO: 








-* 
m 






PDB annotation 


GP120, T-CFJ J . SURl^ACb 
GLYCOPROTEIN CD4, 3 
ANTIGEN-BINDING 
FRAGMENT OF HUMAN 
IMMUNOGLOBULIN 17B. 4 
GLYCOSYLATED PROTEIN 


IMMUNOGLOBULIN INTACl A 
IMMUNOGLOBULIN V ^ 
REGION C REGION, 
IMMUNOGLOBULIN 


COMPLEX 

(ANTIBODY/ANTIGEN) 


CYTOKINE RECEPTOR. 
COMPLEX 

(ANTIBODY/ANTIGEN), 2 

TRANSMEMBRANE, 

GLYCOPROTEIN 


COMPLEX 

(IMMUNOGLOBULIN/RECEP 
TOR) TCR VAPLHA VBETA 
DOMAIN; T-CBLL 
RECEPTOR, STRAND 
SWITCH. FAB. 

ANTICLONOTYHC.2 1 
(IMMUNOGLOBULIN/RECEP f 
TOR) 




IMMUNE SYSTEM HUMAIN ? 
TCR/PEPTDDE/MHC 1 
COMPLEX, HLA-A2, HTLV-LI 
TAX, TCR, T 2 CELL | 


Compound 




IGG2A INTACT ANTIBODY 
- MAB231; CHAIN: A, B, C, 
D 


S o 
la 


RECEPTOR ALPHA CHAIN: 
CHAIN: I; 


KB5-C20 T-CELL ANTIGEN 
RECEPTOR; CHAIN: A. B; 
ANTIBODY DESIRE-1; 
CHAIN: L,H; 


• 


RECEPTOR; CHAIN: A. B; 
ANTffiODY DESIRE-1: 
CHAIN: L.H; 




MHCCLASSIHLA-A; 
CHAIN:A;BETA-2 
MICROGLOBULIN; CHAIN: 
B; TAX PEPTIDE P6A; 


SEQFOLD 
score 






50.28 


62.17 




71.85 


PMF 
score 




0.45 






r-l 

d 




Verify 
score 




0.01 






0.24 




Psl 
Blast 




3.4e-35 


3.4e-27 


1.7e-40 


! 


vo 

CO 






00 
NO 

<— 1 




1—1 






START 
AA 




OS 










CHAIN 
ID 




» 




< 


< 


Q 


PDB 
ID 






Ijrh 






Iqm 


a . 

CO 


1 








in 








WO 02/081731 



PCT/US02/01222 







PDB annotation 






HYDROLASE MACHE; 
HYDROLASE, SERINE 
ESTERASE, 

ACETYLCHOLINESTERASE, 


TETRAMER, 2 HYDROLASE 
FOLD, GLYCOSYLATED 
PROTEIN 


HYDROLASE PNB 
ESTERASE; ALPHA^BETA 
HYDROLASE DIRECIHO % 
EVOLUTION r 


HYDROLASE ALPHA BETA J 
HYDROLASE FOLD, H 
PROLINE, PROLYL J 
AMINOPBPTIDASE. 2 H 
SMKRATIA, m 
IMINOPEPTIDASB O 






Compound 


(E.C.3.1.1.3) COMPLEXED 
WITH COLIPASE AND 
INHIBrrEDlLPB3BY 
UNDECANE 

PHOSPHONATE METHYL 
ESTER (TWO 

CONFORMATIONS) ILPB 4 


HYDROLASE LIPASE 
(E.C.3.1.1.3) 
(TRIACYLGLYCEROL 
LIPASE) COMPLECED 
WITH1LPP3 

HEXADECANESULFONAT 
E1IJPP41LPP71 


i ^ 
gi 


i 


r 

I 


PROLYL 

AMINOPBPTIDASE; 
CHAIN: A: 


i HYDROLASE(CARBOXYLI 
C ESTERASE) LIPASE 
(E.C.3.LL3) 

j TRL^CYLGLYCEROL 
HYDROLASE ITHG 3 


HYDI^OLASECCARBOXYLI 
C ESTERASE) LIPASE 
(E.C.3.1.1.3) 


SEQFOLD 
score 




64.96 


62.20 


66.32 






75.37 


PMF 
score 










0.40 


0.24 




Verify 
score 










-0.31 

i 


-0.37 

1 
1 




Psi 
Blast 




■* 
in 


00 

6 

00 

>o 


VO 


1.2e-05 


J? 

4> 

00 

IS 


»n 

i-H 








*n 

V) 


m 


B 




Ov . 


START 
AA 






1— < 


cn 


«n 

VO • 




l-H 


CHAIN 
ID 








< 


< 










llpp 


Imaa 


CP 
1-1 


Iqtr 




1 Ithg 


SEQID 
NO: 




VO 

in 


so 
«n 


VO 

»n 


VO 

m 




VP 

in 
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PDB annotation 


RECEPTOR 1 


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD. GLYCOPROTEIN 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN || 
aG)LIKE DOMAINS '\ 
BELONGING TO THE I-SET 2 
SUBGROUP WITHIN IG-LIKE 
DOMAINS. B-TREFOIL FOLD 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN 
aG)LIKE DOMAINS 
BELONGING TO THE I-SET 2 
SUBGROUP WITHIN IG-LlKE 
DOMAINS. B-TREFOILFOLD 


IMMUNE SYSTEM FC- 
EPSELONRI- ALPHA; 
IMMUNOGLOBULIN FOLD. 
GLYCOPROTEIN. 
RECEPTOR, IGE-BINDING 2 
PROTEIN «ff 


IMMUNE SYSTEM HlCm f | 
AFPmiTY IGE-FC ' , 
RECEPTOR. FC(EPSILON) f 
IQE-FC; IMMUNOO-OBULIN " i 
FOLD. GLYCOPROTEIN, W 
RECEPTOR, IGE-BINDING 2 im 
PROTEIN, IGE AN TIBODY, ft 
IGE-FC Hi 


IMMUNE SYSTEM. 
MEMBRANE PROTEIN CD32; \ 
FCRECEFIOR, f' 
IMMUNOGLOULIN, 
LEUKOCYTE. CD32 |1 j 


IMMUNE SYSTEM CD32; fJ| 
RECEPTOR, FC.CD32, 


Compound 




NEURAL CELL ADHESION 
MOLECULE; CHAIN: A, B, 
C.D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A. B, C. 
D; FIBROBLAST GROWTH 
FACTOR RECEPrOR 2; 
CHAIN: E,F, an; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B. C. 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E.F,G,H; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RRCRPTOR 
CHAIN: A; 


HIUHAFFINITV 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; IG EPSILON 
CHAIN C REGION; CHAIN: 
B,D; 


FC RECEPTOR 
FC(GAMMA)RIL\; CHAIN: 
A; 


FC GAMMA RUB; CHAIN: 


SEQFOLD 
score 


















PMF 
score 




0.92 

1 

1 


0.31 


0.65 


0.34 


0.27 


0.29 


0.71 


Verify 
score 




0.34 


0.12 


0.12 


0.04 


-0.14 


0.04 


0.33


Psi 
Blast 






1.1e-15




START 
AA 






CHAIN 
ID 




< 


w 


a 


< 


<! 


< 


< 


Is 




1epf




1 




ed 

<M 

f-H 


1fcg


2fcb 


SEQ ID
NO: 




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PDB annotation 


PROTEIN 


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC












PQ 










Compound 


BINDING SITE; CHAIN
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN
C;


QGSR ZINC FINGER


SEQFOLD 
score 


















PMF 
score 




1.00


-0.06 


0.53 


0.00 


0.19 


cs 
d 


0.63


Verify 
score 




9 


0.10 


0.02 


-0.37 


-0.21 


? 

9 


0.11


Psi 
Blast 




0) 
tT 

CO 


6 

00 


6.8e-28 


o\ 
cs 
o 
tn 

00* 


3.4e-25 


CO 


1e-29








START 
AA 






























< 


< 


<: 


< 


< 




< 


< 






1a1h


1a1h


1a1h


1a1h


1a1h


1a1h


1a1h


SEQ ID 
NO: 




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PDB annotation 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQFOLD 
score 














PMF 
score 


0.81 


1.00 


0.95 


0.03


0.70 


0.35 


Verify 
score 


-0.13 


0.01 


-0.16 


-0.70 


-0.22 


-0.38 


Psi
Blast 




1.7e-43


1e-46


START 
AA 




CHAIN 
ID 




U 




a 


a 




PDB
ID


e 


j 


r 


r 


1mey


1mey


1 


SEQ ID
NO:


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§ 

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PDB annotation 


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


ZINC FINGER
TRANSCRIPTION FACTOR
SP1; ZINC FINGER,
TRANSCRIPTION
ACTIVATION, SP1


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC


Compound 


CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SP1F2; CHAIN: NULL;


TRANSCRIPTION FACTOR
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F;


SEQFOLD
score 














PMF 
score




0.17 


0.65 


0.10 


0.03 


0.88


Verify 
score 
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— y^x 


r 




COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR


Compound 


ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;






























































PMF 
score 




0.07 


0.45 


0.83 


0.18 


Verify 
score 
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PDB aimotation 


BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


IMMUNE SYSTEM FC-
EPSILONRI-ALPHA;
IMMUNOGLOBULIN FOLD,
GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN


IMMUNE SYSTEM FC-
EPSILONRI-ALPHA;
IMMUNOGLOBULIN FOLD,
GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN


IMMUNE SYSTEM HIGH
AFFINITY IGE-FC
RECEPTOR, FC(EPSILON)
IGE-FC; IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN, IGE ANTIBODY,
IGE-FC


Compound 


CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


HIGH AFFINITY
IMMUNOGLOBULIN
EPSILON RECEPTOR
CHAIN: A;


HIGH AFFINITY
IMMUNOGLOBULIN
EPSILON RECEPTOR
CHAIN: A;


HIGH AFFINITY
IMMUNOGLOBULIN
EPSILON RECEPTOR
CHAIN: A; IG EPSILON
CHAIN C REGION; CHAIN:
B, D;


SEQFOLD 
score 
















PMF 
score 




0.09 


0.12


0.71 


0.30 




0.13 


Verify 
score 
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1 


9 
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PDB annotation 


ALKYLATION, 3
PHOSPHORYLATION,
CONTRACTILE PROTEIN


PROTEIN, MYOSIN
SUBFRAGMENT-1, MYOSIN
HEAD, 2 MOTOR PROTEIN


MUSCLE PROTEIN MUSCLE
PROTEIN, MYOSIN
SUBFRAGMENT-1, MYOSIN
HEAD, 2 MOTOR PROTEIN


KINASE KINASE, SIGNAL
TRANSDUCTION,
CALCIUM/CALMODULIN


KINASE KINASE, SIGNAL
TRANSDUCTION,
CALCIUM/CALMODULIN


TRANSFERASE
TRANSFERASE,
SERINE/THREONINE-
PROTEIN KINASE, CASEIN
KINASE, 2 SER/THR KINASE


Compound 








CALCIUM/CALMODULIN-
DEPENDENT PROTEIN
KINASE; CHAIN: NULL;


CALCIUM/CALMODULIN-
DEPENDENT PROTEIN
KINASE; CHAIN: NULL;


PROTEIN KINASE
CK2/ALPHA-SUBUNIT;


TRANSFERASE(PHOSPHOTRANSFERASE)
$C-/AMP$-
DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA ($S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


TRANSFERASE(PHOSPHOTRANSFERASE)
$C-/AMP$-


SEQFOLD
score




392.48 






142.86 




110.78 




168.62 






1 






1.00 






1.00




1.00 




> 




1 






0.58 






0.39 




0.54 




Psi 
Blast 




o 


o 




00 




3.6e-56 


o 


o 







START 
AA 






NO 




o 


rH 










B 




< 


< 










P3 








1 


1 




1a06


VO 

*s 


o 

"8 






1apm


SEQ ID 
NO:




2 


8 

•-4 




o 


110- 


on 


110 


o 






WO 02/081731



PCT/US02/01222 












PDB annotation 


CELL DIVISION, MITOSIS,
PHOSPHORYLATION


TRANSFERASE JNK3;
TRANSFERASE, JNK3 MAP
KINASE,


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION


TRANSFERASE MITOGEN
ACTIVATED PROTEIN
KINASE; TRANSFERASE,
MAP KINASE,
SERINE/THREONINE-
PROTEIN KINASE, 2 P38


KINASE RABBIT MUSCLE
PHOSPHORYLASE KINASE;
GLYCOGEN METABOLISM,
TRANSFERASE,
SERINE/THREONINE-
PROTEIN, 2 KINASE, ATP-
BINDING, CALMODULIN-
BINDING


SERINE KINASE SERINE
KINASE, TITIN, MUSCLE,


Compound 




C-JUN N-TERMINAL
KINASE; CHAIN: NULL;


i 


TWITCHIN; CHAIN: A, B; 




TWITCHIN; CHAIN: A, B; 


MAP KINASE P38; CHAIN: 
NULL; 


PHOSPHORYLASE 
KINASE; CHAIN: NULL; 


PHOSPHORYLASE 
KINASE; CHAIN: NULL; 


< 


SEQFOLD 
score 




127.21 




139.53 






119.80 




170.32 


124.66 


PMF 
score 






1.00 




1.00 


1.00 




1.00






Verify 
score 






0.48 




0.39 


0.36 




0.77 
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8- 










PDB annotation


EUKARYOTIC INITIATION
FACTOR 4A; IF4A,
HELICASE, DEAD-BOX
PROTEIN


CHAPERONE/STRUCTURAL
PROTEIN CHAPERONE
ADHESIN DONOR STRAND
COMPLEMENTATION, 2
CHAPERONE/STRUCTURAL
PROTEIN


CHAPERONE/STRUCTURAL
PROTEIN CHAPERONE
ADHESIN DONOR STRAND
COMPLEMENTATION, 2
CHAPERONE/STRUCTURAL
PROTEIN




HYDROLASE HYDROLASE,
DEPHOSPHORYLATION


HYDROLASE PTP1B;
HYDROLASE,
PHOSPHORYLATION,
LIGAND, INHIBITOR


HYDROLASE C2 DOMAIN,
PHOSPHOTIDYLINOSITOL,
PHOSPHOTASE,
HYDROLASE


HYDROLASE PROTEIN-
TYROSINE PHOSPHATASE;
HYDROLASE, PROTEIN
TYROSINE PHOSPHATASE,
CATALYTIC DOMAIN, 2
WPD LOOP, SH2 DOMAIN


HYDROLASE DUAL
SPECIFICITY
PHOSPHATASE, MAP
KINASE HYDROLASE


HYDROLASE DUAL


Compound 


FACTOR 4A; CHAIN: A, B;


PAPD-LIKE CHAPERONE
FIMC; CHAIN: A, C, E, G, I,
K, M, O; MANNOSE-
SPECIFIC ADHESIN FIMH;
CHAIN: B, D, F, H, J, L, N, P;


PAPD-LIKE CHAPERONE
FIMC; CHAIN: A, C, E, G, I,
K, M, O; MANNOSE-
SPECIFIC ADHESIN FIMH;
CHAIN: B, D, F, H, J, L, N, P;




PROTEIN TYROSINE
PHOSPHATASE 1B;
CHAIN: NULL;


PROTEIN-TYROSINE
PHOSPHATASE 1B;
CHAIN: A;


PHOSPHOINOSITIDE
PHOSPHOTASE PTEN;
CHAIN: A;


SHP-1; CHAIN: NULL;


PYST1; CHAIN: NULL;


PYST1; CHAIN: NULL;


SEQFOLD 
score 






















PMF 
score 




-0.20 


9 




0.03 


S 
o 


0.95 


0.01


0.71


0.48


Verify 
score 




0.14 


0.29




0.25 


0.20 


0.38 


0.18 


0.14 


-0.04


Psi
Blast 
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o 
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00 
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oo 
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1307 






START 
AA 




1227 


1241 






CHAIN 
ID 












< 


< 












1qun


1qun




•a 


00 

o 




1 


1 


1mkp


p 






















3 










«-4 


*>-4 
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PDB annotation 


SPECIFICITY
PHOSPHATASE, MAP
KINASE HYDROLASE


RECEPTOR D1; RECEPTOR,
PHOSPHATASE, SIGNAL
TRANSDUCTION,
ADHESION, 2 HYDROLASE


HYDROLASE VHR;
HYDROLASE, PROTEIN
DUAL-SPECIFICITY
PHOSPHATASE


HYDROLASE D1;
HYDROLASE, SIGNAL
TRANSDUCTION,
RECEPTOR,
GLYCOPROTEIN, 2
PHOSPHORYLATION,
SIGNAL


HYDROLASE YOP51, YOP2B,
PASTEURELLA X, PTP-ASE,
PROTEIN TYROSINE
PHOSPHATASE,
HYDROLASE


HYDROLASE YOP51, YOP2B,
PASTEURELLA X, PTP-ASE,
PROTEIN TYROSINE
PHOSPHATASE,


SYP, SHPTP-2; TYROSINE
PHOSPHATASE, INSULIN
SIGNALING, SH2 PROTEIN


HYDROLASE SUMO
HYDROLASE, UBIQUITIN-


Compound 




RECEPTOR PROTEIN
TYROSINE PHOSPHATASE
MU; CHAIN: A, B;


HUMAN VH1-RELATED
DUAL-SPECIFICITY
PHOSPHATASE CHAIN: A,
B;


RECEPTOR PROTEIN
TYROSINE PHOSPHATASE
ALPHA; CHAIN: A, B;


YERSINIA PROTEIN
TYROSINE
PHOSPHATASE; CHAIN:
NULL;


YERSINIA PROTEIN
TYROSINE
PHOSPHATASE; CHAIN:
NULL;


SHP-2; CHAIN: A, B;


ULP1 PROTEASE; CHAIN:
A; UBIQUITIN-LIKE


SEQFOLD 
score 


















PMF 
score 




0.09 




0.40 


0.09 


0.00 


0.39 


-0.12 
1.00 


Verify
score 




-0.10 




0.38 


0.31 


-0.05 


-0.07 


d d 








1 

N 


vH 

<6 

00 

vd 


6.8e^6 


cs 

i 

NO 


B 

s> 

<> 
cn 


QO 

1 1 


END 
AA 




00 « 

o\ < 




o 
cn 


00 








START 
AA 




00 


CM 


O 

NO 
»-» 










CHAIN
ID




< 


< 


< 


< 






< < 


PDB
ID




1rpm


1vhr


1vhr


1yfo


lym 


1ytn


2shp 


e . 


5 




<— 4 






.—4 


t— « 
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PDB annotation 


ASSOCIATED 35 KDA
PROTEIN, P35A THREE
HELIX BUNDLE


ENDOCYTOSIS/EXOCYTOSIS
SYNAPTOTAGMIN
ASSOCIATED 35 KDA
PROTEIN, P35A THREE
HELIX BUNDLE


CONTRACTILE PROTEIN
TRIPLE-HELIX COILED
COIL, CONTRACTILE
PROTEIN


TRANSCRIPTION
REGULATION SIGMA70;
RNA POLYMERASE SIGMA
FACTOR, TRANSCRIPTION
REGULATION




METAL TRANSPORT
INHIBITOR/RECEPTOR HFE;
HFE, HEREDITARY
HEMOCHROMATOSIS, MHC
CLASS I, TRANSFERRIN 2
RECEPTOR


HYDROLASE SGAP;
DOUBLE-ZINC
METALLOPROTEINASE,
CALCIUM ACTIVATION,
PROTEIN- 2 INHIBITOR


Compound 




SYNTAXIN-1A; CHAIN: A,
B, C;


HUMAN SKELETAL
MUSCLE ALPHA-ACTININ
2; CHAIN: A;


RNA POLYMERASE
PRIMARY SIGMA
FACTOR; CHAIN: NULL;




HYDROLASE(AMINOPEPTIDASE)
AMINOPEPTIDASE
(AEROMONAS
PROTEOLYTICA)
(E.C.3.4.11.10) 1AMP 3


HYDROLASE(AMINOPEPTIDASE)
AMINOPEPTIDASE
(AEROMONAS
PROTEOLYTICA)
(E.C.3.4.11.10) 1AMP 3


HEMOCHROMATOSIS
PROTEIN; CHAIN: A, D, G;
BETA-2-
MICROGLOBULIN; CHAIN:
B, E, H; TRANSFERRIN
RECEPTOR; CHAIN: C, F, I;


AMINOPEPTIDASE;
CHAIN: A;


SEQFOLD 
score 




















PMF 
score 




-0.18 


-0.12 


0.06 




0.43 


0.43 


0.83 


0.64 


Verify 
score 




0.08 


0.02 


-0.24 




0.04 


0.11 


0.02 


-0.00 


Psi
Blast 




00 


NO 


1.2e-09 




1.2e-18 


5.1e-30 


le-46 






START 
AA 






CHAIN 
ID 




< 


< 












< 






8 


1quu


.00 

'23 




1amp


1amp


1 




SEQ ID
NO: 
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00 


00 








o 

CM 
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PDB annotation 


HALOPEROXIDASE


HALOPEROXIDASE,
OXIDOREDUCTASE


HALOPEROXIDASE
CHLOROPEROXIDASE A1,
HALOPEROXIDASE A1;
HALOPEROXIDASE,
OXIDOREDUCTASE


HALOPEROXIDASE
HALOPEROXIDASE F;
HALOPEROXIDASE,
OXIDOREDUCTASE,
PROPIONATE COMPLEX


AMINOPEPTIDASE
AMINOPEPTIDASE,
PROLINE IMINOPEPTIDASE,
SERINE PROTEASE, 2
XANTHOMONAS
CAMPESTRIS


HYDROLASE HYDROLASE,
HALOALKANE
DEHALOGENASE,
ALPHA/BETA-HYDROLASE


HALOPEROXIDASE
HALOPEROXIDASE A2,


OXIDOREDUCTASE,
PEROXIDASE, ALPHA/BETA
2 HYDROLASE FOLD,
MUTANT M99T


HYDROLASE BPHD;
HYDROLASE, PCB
DEGRADATION


HYDROLASE A/B
HYDROLASE FOLD,
DEHALOGENASE I-S BOND


Compound 


CHLOROPEROXIDASE L;


BROMOPEROXIDASE A1;
CHAIN: NULL;


CHLOROPEROXIDASE F;


PROLINE
IMINOPEPTIDASE; CHAIN:
A, B;


HALOALKANE
DEHALOGENASE; CHAIN:
NULL;


BROMOPEROXIDASE A2;




2-HYDROXY-6-OXO-6-
PHENYLHEXA-2,4-
DIENOATE CHAIN: A;


HALOALKANE
DEHALOGENASE; 1-
CHLOROHEXANE CHAIN:


SEQFOLD 
score 


68.67 


54.47 


50.66 


56.93 


71.01 


68.50 


63.30 


73.30 




















Verify 
score 
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Blast 


L4c^6 


1.7e-40 


! 
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•'J 
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Icqw 
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PDB annotation 


HL33; 50S RIBOSOMAL
PROTEIN L30P, HMAL30,
RIBOSOMAL PROTEIN L31E,
L34, HL30; 50S RIBOSOMAL
PROTEIN L32E, HL5; 50S
RIBOSOMAL PROTEIN L37E,
L35E; 50S RIBOSOMAL
PROTEINS L39E, HL39E,
HL46E; 50S RIBOSOMAL
PROTEIN L44E, LA, HLA; 50S
RIBOSOMAL PROTEIN L6P,
HMAL6, HL10 RIBOSOME
ASSEMBLY, RNA-RNA,
PROTEIN-RNA, PROTEIN-
PROTEIN




SERINE PROTEASE PCPA2;
SERINE PROTEASE,
ZYMOGEN, HYDROLASE


SERINE PROTEASE
PORCINE
PROCARBOXYPEPTIDASE,
SERINE PROTEASE


Compound 


CHAIN: P;
RIBOSOMAL PROTEIN


RIBOSOMAL PROTEIN
L24E; CHAIN: R;
RIBOSOMAL PROTEIN
L29; CHAIN: S;
RIBOSOMAL PROTEIN
L30; CHAIN: T;
RIBOSOMAL PROTEIN
L31E; CHAIN: U;
RIBOSOMAL PROTEIN
L32E; CHAIN: V;
RIBOSOMAL PROTEIN
L37AE; CHAIN: W;
RIBOSOMAL PROTEIN
L37E; CHAIN: X;
RIBOSOMAL PROTEIN
L39E; CHAIN: Y;
RIBOSOMAL PROTEIN
L44E; CHAIN: Z;
RIBOSOMAL PROTEIN L6;
CHAIN: 1;




PROCARBOXYPEPTIDASE
A2; CHAIN: NULL;


PROCARBOXYPEPTIDASE
B; CHAIN: NULL




HYDROLASE(C-
TERMINAL PEPTIDASE)
PROCARBOXYPEPTIDASE
A (E.C.3.4.12.2) 1PCA 3




CENTROMERE PROTEIN
B; CHAIN: A;


SEQFOLD 
score 






107.01 


110.63


114.78 






PMF 
score 














0.83 


Verify 
score 














0.08 


Psi
Blast 
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PDB annotation 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


ZINC FINGER
TRANSCRIPTION FACTOR
SP1; ZINC FINGER,
TRANSCRIPTION
ACTIVATION, SP1


ZINC FINGER
TRANSCRIPTION FACTOR
SP1; ZINC FINGER,
TRANSCRIPTION
ACTIVATION, SP1


COMPLEX (TRANSCRIPTION


Compound 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SP1F3; CHAIN: NULL;


SP1F2; CHAIN: NULL;


SEQFOLD 
score 
















PMF 
score 


0.49 


0.16


0.57


0.39 


0.43 


0.54 


0.07


Verify 
score 


-0.13 


-0.06 


9- 


-0.14 


0.03 


-0.40 

1 

1 
1 


0.15


Psi
Blast 




1 


« 


3.4e-13 






5.1e-19




n 




so 
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00 
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00 
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00 
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00 
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ID 
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is 
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Imey 
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1 


Ispl 


1 
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PDB annotation 


PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX


Compound 


PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQFOLD 
score 














PMF 
score 




0.89 


1.00 

1 


1.00 


1.00


1.00 


Verify 
score 




0.19 


0.29 


0.30 


0.21 


0.37 
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PDB annotation 


INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




SEQFOLD
score 




100.48










PMF 
score 






1.00 


1.00 


1.00 


0.09 


Verify 
score 






0.43 


0.61 


0.48 


O.U 
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.1 .-.^ mpg8.0ic 




PDB annotation 






COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 


MUTANT WITH CYS 11
1BBO 3 REPLACED BY
ABU (C11ABU) (NMR, 60


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQFOLD 
score 












• 


PMF 
score 




-0.20 


0.95 


1.00 


o 
o 


1.00 


Verify 
score 




0.01 


0.14 


0.48 


0.57 


0.79 


Psi 
Blast 




1.7e-30 


1.5e-42 




START 
AA 
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00 


CO 


CM 
CO 


s 


|s 
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PDB annotation 


INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-


Compound 


DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;




















SEQFOLD
score 










t~- 
















CO 








PMF 
score 




1.00 


1.00 




1.00


Verify 
score 




0.05 


0.55 
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PDB annotation 


DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQFOLD 
score 
















PMF 
score 




1.00 


o 
q 


1.00 


1.00 


1.00


1.00 


Verify 
score 




no 


0.25 


0.12 


0.03 


0.25 


0.24 
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Blast 
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PDB annotation 


INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING


Compound 

. _ 


DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN


SEQFOLD 
score 












92.67 


PMF 
score 




0.88 


0.84 


1.00 


0.80 




Verify 
score 




cn 

d 

I- 


-0.04 


0.02 


0.23 
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Blast 
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1 


1.7e-26


7.2e-67
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PDB annotation 


CELL ADHESION LFA-1,
ALPHA-L\BETA-2
INTEGRIN, A-DOMAIN; 1LFA




COMPLEX (TRANSCRIPTION
FACTOR/DNA)
TRANSCRIPTION FACTOR,
PROTEIN-DNA COMPLEX,
CYTOKINE 2 ACTIVATION,
COMPLEX (TRANSCRIPTION
FACTOR/DNA)




DNA-BINDING HMGA DNA-
BINDING HMG-BOX
DOMAIN A OF RAT HMG1;
1AAB 8 HMG-BOX 1AAB 20


DNA-BINDING HMGA DNA-
BINDING HMG-BOX
DOMAIN A OF RAT HMG1;
1AAB 8 HMG-BOX 1AAB 20


DNA BINDING PROTEIN
HMG BOX, DNA BENDING,
DNA RECOGNITION,
CHROMATIN, NMR, DNA 2
BINDING PROTEIN


DNA BINDING PROTEIN
HMG BOX, DNA BENDING,
DNA RECOGNITION,
CHROMATIN, NMR, DNA 2
BINDING PROTEIN




Compound 


CD11A; 1LFA 5 CHAIN: A,
B; 1LFA 6




STAT3B; CHAIN: A; 18-
MER
DESOXYOLIGONUCLEOTIDE;
CHAIN: B;




HIGH MOBILITY GROUP
PROTEIN; 1AAB 5 CHAIN:
NULL; 1AAB 6


HIGH MOBILITY GROUP
PROTEIN; 1AAB 5 CHAIN:
NULL; 1AAB 6


NON HISTONE PROTEIN 6
A; CHAIN: A;


NON HISTONE PROTEIN 6
A; CHAIN: A;


HIGH MOBILITY GROUP 1
PROTEIN; CHAIN: A; DNA
(5'-D(*CP*CP*(ID0)
CHAIN: B; DNA (5'- CHAIN:
C;


SEQFOLD 
score 














54.43 






PMF 
score 


5 
d 




10*0 




1.00 


0.99 




1.00 


0.99 


Verify 
score 


0.10 




0.02 




0.32 


0.67 




0.52 


0.63 


Psi 
Blast 
PDB annotation


INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER 


PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)




Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


SEQFOLD 
score 




108.72 










PMF 
score 






1.00 

i 


0.98


0.94 


-0.14 


Verify 
score 






0.27 


0.03 


0.36 


0.16 


Psi
Blast 
PDB annotation


FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)






IMMUNOGLOBULIN
IMMUNOGLOBULIN, FAB,
ANTIBODY, ANTI-E-
SELECTIN




Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ELECTRON 

TRANSFER(IRON-SULFUR
PROTEIN) RUBREDOXIN
6RXN 3




MONOCLONAL ANTI-E- 
SELECTIN 7A9
ANTIBODY; CHAIN: L, H;


HUMAN 

IMMUNODEFICIENCY 
VIRUS TYPE 1 CAPSID 
CHAIN: A, B; ANTIBODY
FAB25.3 FRAGMENT;
CHAIN: H, K, L, M;


SEQFOLD 
score 






101.31 










50.83 




PMF 
score 




0.48 




1.00 


0.72 


0.96 






0.98 


Verify
score 




0.19 




0.32 


0.12 


0.63 






■* 
9 










! 




VO 
cn 




00 


00 


PQ 




6 


•-4 






0.00 




6 
<o 








CO 




i 


00 


CO 




VO 

c5 




START 
AA 










Si 


VO 
CO 








CHAIN 
ID 




< 


< 


< 


< 


















2gli 


2gli 


B 

1 






1afv
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NO: 
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& 
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5 
PDB annotation






SIGNALING PROTEIN GTP-
BINDING PROTEINS,
PROTEIN-PROTEIN
COMPLEX, EFFECTORS


SIGNALING PROTEIN GTP-
BINDING PROTEINS,
PROTEIN-PROTEIN
COMPLEX, EFFECTORS


SIGNALING PROTEIN GTP-
BINDING PROTEIN RHOA,
GTPASE RHOA; RHO GDI 1;
RHO GTPASE, G-PROTEIN,
SIGNALING PROTEIN


SIGNALING PROTEIN G
PROTEIN, GTP
HYDROLYSIS, KINETIC
CRYSTALLOGRAPHY, 2
SIGNALING PROTEIN


SIGNALING PROTEIN G
PROTEIN, GTP
HYDROLYSIS, KINETIC
CRYSTALLOGRAPHY, 2
SIGNALING PROTEIN


SIGNALING PROTEIN
PROTEIN-PROTEIN
COMPLEX, ANTIPARALLEL
COILED-COIL


ENDOCYTOSIS/EXOCYTOSI
S G-PROTEIN, GTPASE,
RAB6, VESICULAR
TRAFFICKING


Compound 


AND LYSOZYME

(E.C.3.2.1.17) 3HFM 4
COMPLEX 3HFM 5




RAS-RELATED PROTEIN 
RAP-1A; CHAIN: A;
PROTO-ONKOGENE 
SERINE/THREONINE 
PROTEIN KINASE CHAIN: 
B; 


or 
< 


RAP-1A; CHAIN: A;
PROTO-ONKOGENE
SERINE/THREONINE
PROTEIN KINASE CHAIN:
B;


TRANSFORMING PROTEIN
RHOA; CHAIN: A, C; RHO
GDP-DISSOCIATION
INHIBITOR ALPHA;
CHAIN: E, F;


TRANSFORMING PROTEIN 
P21/H-RAS-1; CHAIN: A; 


TRANSFORMING PROTEIN 
P21/H-RAS-1; CHAIN: A;


& 

< 


TRANSFORMING PROTEIN
RHOA (0-181); CHAIN: A;
PKN; CHAIN: B;


RAB6 GTPASE; CHAIN: A; 


SEQFOLD 
score 






57.43 






51.73 








PMF 


score 








o 


0.00 




0.36 


0.16 


0.25 


Verify 
score 








0.28 


0.04 




0.34 


0.34 


0.30 




Blast 






1.2e-63 


1.2e-63 


% 

»— 1 


1e-65


1e-65


in 












m 
vo 
1-1 






« 


vo 


vo 


START 
AA 














PI 




cn 


CHAIN 
ID 






< 


< 


< 


< 


"< 




< 








1c1y


1c1y


1cc0


1ctq


1ctq


1cxz




1 


NO: 






3 

m 


g 

m 


3 

cn 


3 


8 

CO 


s 


o\ 
o 
cn 
PDB annotation


ENDOCYTOSIS/EXOCYTOSI
S G PROTEIN, VESICULAR
TRAFFIC, GTP
HYDROLYSIS, YPT/RAB 2
PROTEIN, ENDOCYTOSIS,
HYDROLASE




GTP-BINDING GTP-
BINDING, GTPASE, SMALL
G-PROTEIN, RHO FAMILY,
RAS SUPER 2 FAMILY


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,
NEUROTRANSMITTER
RELEASE, HYDROLASE


COMPLEX (MHC/VIRAL
PEPTIDE/RECEPTOR) HLA-
A2 HEAVY CHAIN; CLASS I
MHC, T-CELL RECEPTOR,


Compound 


GTP-BINDING PROTEIN 
YPT51; CHAIN: A; 


RAP2A; CHAIN: NULL; 


RAP2A; CHAIN: NULL; 


RAC1; CHAIN: NULL;




RAB-3A; CHAIN: A; 
RABPHILIN-3A; CHAIN: B; 


< 

J 
< 

CO 


HLA-A 0201; CHAIN: A; 
BETA-2 MICROGLOBULIN; 
CHAIN: B; TAX PEPTIDE;
CHAIN: C; T CELL 


SEQFOLD 
score 




68.65 










73.76 


PMF 
score 


0.29 




0.72 


-0.01 


0.34 


0.36 




Verify 
score 


0.41 




0.14 


0.10 


0.05 


-0.08 




Psi
Blast 


rn 


6.8e-60 


S 

00 


m 

A 

CO 


f— • 
VO 

CO 

vd 


1.7e-61 








3 


8 


vo 
vo 




S 


00 

f— < 


START 
AA 


CM 






CO 




CO 


CO 
CM 


CHAIN 
ID 


< 








< 


< 


Q 


PDB 

m 


1ek0




1kao


1kao


1mh1


1zbd


CO 


1 


a. 


i 


S 
cn 


3 

CO 


3 


8 

CO 


8 

CO 


o 
PDB annotation


IMMUNOGLOBULIN 
FRAGMENT, BENCE-JONES 
2 PROTEIN, IMMUNE
SYSTEM


IMMUNE SYSTEM FAB-IBP
COMPLEX CRYSTAL
STRUCTURE 2.7A
RESOLUTION BINDING 2
OUTSIDE THE ANTIGEN 
COMBINING SITE 
SUPERANTIGEN FAB VH3 3 
SPECIFICITY 




IMMUNE SYSTEM 
IMMUNOGLOBULIN FOLD, 


15 
D 


PC 






i 

1 


Compound 




IGM RF 2A2; CHAIN: A, C,
E; IGM RF 2A2; CHAIN: B,
D, F; IMMUNOGLOBULIN
G BINDING PROTEIN A;
CHAIN: G, H;


IMMUNOGLOBULIN 3D6 
FAB 1DFB 3


IGM MEZ
IMMUNOGLOBULIN;
CHAIN: L; IGM MEZ
IMMUNOGLOBULIN;
CHAIN: H;


IMMUNOGLOBULIN FV
FRAGMENT OF A
HUMANIZED VERSION OF
THE ANTI-CD18 1FGV 3
ANTIBODY H52' (HUH52-
AA FV) 1FGV 4


IMMUNOGLOBULIN FV
FRAGMENT OF A
HUMANIZED VERSION OF
THE ANTI-CD18 1FGV 3
ANTIBODY H52' (HUH52-
AA FV) 1FGV 4


IMMUNOGLOBULIN
IMMUNOGLOBULIN M
(IG-M) FV FRAGMENT
1IGM 3




SEQFOLD
score 










55.38 




52.10 




PMF
score




0.98 


1.00 


0.99 




0.99 




1.00 


Verify 
score 




0.12 


0.43 


0.66 




0.37 




0.42 


1 


Blast 1 






NO 

1 
cn 
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»— t 
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cn 
1fgv
PDB annotation


INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER




Compound




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;






SEQFOLD 
score 






93.81 








Tables 


PMF 
score 




1.00 




1.00 


0.95 


0.99 




Verify
score 




0.50 




0.45 


0.01 


d 
1 




Psi 
Blast 




1e-50


1e-50


1.4e-47


1.7e-12 


cn 

1-4 








? 


m 




00 


1 

CS 




START 
AA 




m 
cs 








m 

1-4 




CHAIN 
ID 




U 


u 


o 


o 


< 








1mey


1mey


1mey


1mey


1tf6




SEQ ID 
NO: 




cn 


LK 


r- 
tn 
PDB annotation


PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR




Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;




SEQFOLD 
score 




99.41 










Tables 


PMF 
score 






0.86 


1.00 


1.00 


1.00 1 




Verify 
score 






0.03 


-0.01 


0.32 


0.23 




Psi 
Blast 




OO 

cn 
cn 


00 

cn 
cn 




9.9e-50 


cn 




i< 




o 
r- 
cn 


00 


m 
o> 




CM 




START 
AA 






00 


t~ 
ts 


oo 

12 






CHAIN 
ID 




< 


< 


U 


U 


U 








1tf6




1ubd


1ubd


1ubd




SEQ ID
NO: 




cn 


m 


m 


«—< 
m 


cn 
PDB annotation


BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)






BLOOD COAGULATION
BLOOD COAGULATION,
EGF, HYDROLASE, SERINE
PROTEASE


SURFACE PROTEIN
MEROZOITE SURFACE


Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




COAGULATION FACTOR 
EGF-LIKE MODULE OF 
BLOOD COAGULATION 
FACTOR X (N-TERMINAL,
1APO 3 APO FORM) (NMR,
13 STRUCTURES) 1APO 4


FACTOR VII; CHAIN:
NULL;


MEROZOITE SURFACE
PROTEIN 1; CHAIN: A;


SEQFOLD 
score 






8h75 
















PMF 
score 




0.47 




0.87 


0.86 


-0.12 




1.00 


1.00


cn 

9 


Verify 
score 




0.03 




0.23 


-0.16 


0.21 




0.72 

i 


S . 
o 


0.15 


Psi 
Blast 




1.7e-34


J? 


Ti- 
en 

6 

00 

\6 


o 

6 


1.7e-30 




6.6e-12


1.3e-11


s 

6 
cn 

CM 








00 

cn 


cn 
*^ 
cn 


o\ 
a\ 
cn 


CM 




ON 

«n 




00 


START 
AA 




oo 


3 

CM 




cn 
r- 

CM 






m 
«s 






CHAIN 
ID 




< 


< 


< 


< 


< 








< 


la 








2gli 


2gli 


2gli




1apo




•5? 
o 


SEQ ID 
NO: 




1— • 
m 


cn 


r>- 
1—1 

CO 


cn 


m 






cn 


cn 
PDB annotation


MEMBRANE PROTEIN C-
TYPE LECTIN-LIKE
DOMAINS


MEMBRANE PROTEIN C-
TYPE LECTIN-LIKE
DOMAINS


HEMATOPOIETIC CELL
RECEPTOR ACTIVATION
INDUCER MOLECULE (AIM),
EA 1, HEMATOPOIETIC
CELL RECEPTOR,
LEUCOCYTE, C-TYPE
LECTIN-LIKE, 2 NKD, KLR


SUGAR BINDING PROTEIN 
C-TYPE LECTIN, MANNOSE 
RECEPTOR 


COAGULATION FACTOR 
BINDING IX/X-BP 
COAGULATION FACTOR 
BINDING, C-TYPE LECTIN, 
GLA-DOMAIN 2 BINDING, C- 
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


Compound 


FLAVOCETIN-A: ALPHA 
SUBUNIT; CHAIN: A;
FLAVOCETIN-A: BETA
SUBUNIT; CHAIN: B


FLAVOCETIN-A: ALPHA
SUBUNIT; CHAIN: A;
FLAVOCETIN-A: BETA
SUBUNIT; CHAIN: B


EARLY ACTIVATION
ANTIGEN CD69; CHAIN: A;


MACROPHAGE MANNOSE
RECEPTOR; CHAIN: A, B;


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


SEQFOLD 
score 










59.78




70.03 


PMF 
score 


0.81 


0.98 


1.00 


0.75 




1.00 




Verify 
score 


0.61 


0.36 


0.91 

1 
1 


0.71 




0.38 
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Blast 
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CO 
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2 


1c3a


oo 
1ixx
PDB annotation


FACTOR/GROWTH FACTOR


i 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, 
FGFR, IMMUNOGLOBULIN- 


LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR 


i 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, 
FGFR, IMMUNOGLOBULIN- 
LIKE, SIGNAL
TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 




GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


IMMUNE SYSTEM,
MEMBRANE PROTEIN CD32;
FC RECEPTOR,
IMMUNOGLOBULIN,
LEUKOCYTE, CD32


CONTRACTILE PROTEIN
IMMUNOGLOBULIN FOLD,
BETA BARREL


MUSCLE PROTEIN
CONNECTIN, NEXTM5;
CELL ADHESION,
GLYCOPROTEIN,
TRANSMEMBRANE,
REPEAT, BRAIN, 2


Compound 




FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FC RECEPTOR 
FC(GAMMA)RIIA; CHAIN: 
A; 


TELOKIN; CHAIN: A


TITIN; CHAIN: NULL; 


SEQFOLD 
score 






















00 
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o\ 

CO 




00 


0.24 


0.64 


0.24 








d 






d 




d 


Verify 
score 
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ON 
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0.17 
PDB annotation






COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)
COMPLEX (TRANSCRIPTION


Compound 


1BBO 3 REPLACED BY
ABU (C11ABU) (NMR, 60


«t 

0 
25 

3Q 

1-4 

D 


DNA-BINDING PROTEIN 


i 
1 1 


MUTANT WITH CYS 11
1BBO 3 REPLACED BY
ABU (C11ABU) (NMR, 60


0 
n 

pq 

I—I 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


YY1; CHAIN: C; ADENO-


SEQFOLD 
score 














PMF 
score 




0.10 


1.00 


0.66 


0.77 


0.88 


0.94 


Verify 
score 




-0.78 


0.36 


0.20 


-0.37 


0.39 


0.04 
PDB annotation


REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR


Compound 


ASSOCIATED VIRUS P5 
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


SEQFOLD 
score 








83.88




PMF 
score 




1.00 


1.00 




0.40 


Verify 
score 




-0.09 


-0.16 
PDB annotation




OXIDOREDUCTASE 
GLUTAMIC 

DEHYDROGENASE;
GLUTAMATE
DEHYDROGENASE,
ALLOSTERY, ABORTIVE
COMPLEX 


OXIDOREDUCTASE 

(CHOH(D)-NAD+(A)) R-
LACTATE
DEHYDROGENASE; 2DLD 7




COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA) 




CONTRACTILE PROTEIN
TRIPLE-HELIX COILED
COIL, CONTRACTILE
PROTEIN


»• — 

trt 


COMPLEX (GTP-
BINDING/TRANSDUCER)
BETA1, TRANSDUCIN BETA


Compound 




GLUTAMATE 

DEHYDROGENASE; 

CHAIN: A, B, C, D, E, F;


D-LACTATE 

DEHYDROGENASE; 2DLD
5 CHAIN: A, B; 2DLD 6




QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;




HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ
2; CHAIN: A; 




GT-ALPHA/GI-ALPHA 
CHIMERA; CHAIN: A; GT- 
BETA; CHAIN: B; GT-
























SEQFOLD
score 






















PMF 
score 




0.24 


0.75 
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0.01 

i 




0.00 




0.27 


Verify 
score 
PDB annotation


CHIMERA PROTEIN, 
RESPIRATORY PROTEIN, 
HEME 






OXYGEN 

STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT




STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT








OXYGEN TRANSPORT X-
RAY STUDY, PORCINE
HEMOGLOBIN, ARTIFICIAL


Compound 


BETA-ALPHA; CHAIN: A, 
B, C, D;


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


0 

i 

0 
C 


A, C; HEMOGLOBIN D; 
CHAIN: B, D; 


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)
1HDA 3


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)
1HDA 3




SEQFOLD 
score 




120.38 






148.19 


96.34 




136.61 




PMF
score






1.00


1.00 






1.00 




1.00 


Verify
score






0.61 


0.93 






0.49 




0.78 
PDB annotation










OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT


OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT








Compound 


OXYGEN TRANSPORT 
HEMOGLOBIN (DEOXY, 
HUMAN FETAL F=II=)
1FDHG 1 1FDHH 2


OXYGEN CARRIER 
HEMOGLOBIN (DEOXY) 
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)
1HDA 3


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)


SEQFOLD 
score 


100.14 




101.11


1 76.75 




1 117.48 


81.42 

i 




105.99 


Si 
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1.00 






1.00 




Verify 
score 
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0.49 




Psi 
Blast 


<*> 

cn 


cn 
en 


3.3e-39 


VO 
m 

NO 

vd 


ON 


o\ 

•2 

cn 


3.3e-32 


en 


i 

rn 






ON 


ON 


o\ 


ON 

r- C 


ON 


VO 


ON 

1-1 


ON 


START 
AA 


ON 

m 


(S 


o\ 

CO 


oo 
cn 


1— » 


cn 


cn 
PDB annotation






i 


Compound 


GLYCOSYL) LYSOZYME
(E.C.3.2.1.17) MUTANT
WITH CYS 54 REPLACED
BY THR, 119L 3 CYS 97
REPLACED BY ALA, ALA
134 REPLACED BY SER
(C54T,C97A, 119L 4 A134S)
119L 5


HYDROLASE (O-
GLYCOSYL) LYSOZYME
(E.C.3.2.1.17) MUTANT
WITH CYS 54 REPLACED
BY THR, 119L 3 CYS 97
REPLACED BY ALA, ALA
134 REPLACED BY SER
(C54T,C97A, 119L 4 A134S)
119L 5


HYDROLASE (O-
GLYCOSYL) LYSOZYME
(E.C.3.2.1.17) MUTANT
WITH THR 34 REPLACED
BY ALA, 174L 3 LYS 35
REPLACED BY ALA, SER
36 REPLACED BY ALA,
PRO 37 174L 4 REPLACED
BY ALA, SER 38
REPLACED BY ASP, ASN
40 REPLACED BY 174L 5
ALA, SER 44 REPLACED
BY ALA, GLU 45
REPLACED BY ALA, ASP
47 174L 6 REPLACED BY
ALA, LYS 48 REPLACED
BY ALA, CYS 54
REPLACED BY 174L 7 THR,
CYS 97 REPLACED BY
ALA

(T34A,K35A,S36A,P37A,S38
D,N40A, 174L 8


SEQFOLD 
score 








PMF 


score




0.11 


0.03 


Verify 
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i 




-0.12 


-0.09 


Psi
Blast 
PDB annotation


BINDING PROTEIN




HYDROLASE ERA, GTPASE,
RNA-BINDING, RAS-LIKE,
HYDROLASE 




TRANSLATION EF-TU; 
GTPASE, MOLECULAR 
SWITCH, TRNA, RIBOSOME, 
Q-BETA REPLICASE, 2 
CHAPERONE, DISULFIDE
ISOMERASE 


TRANSLATION EF-G; BENT 
CONFORMATION, VISIBLE 
DOMAIN III, MUTATION
HIS573ALA 


TRANSLATION 
TRANSLATIONAL GTPASE


PROTEIN BINDING EF-G; EF-
G, ELONGATION FACTOR,
TRANSLOCASE, RIBOSOME,
ELONGATION, 2
TRANSLATION, PROTEIN
SYNT FACTOR, GTPASE,
GTP BINDING, 3
GUANOSINE NUCLEOTIDE
BINDING, PROTEIN
BINDING




COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC


Compound 






GTP-BINDING PROTEIN


ERA; CHAIN: A, B; 


TRANSPORT AND 
PROTECTION PROTEIN 
ELONGATION FACTOR TU 

(DOMAIN I) -
*GUANOSINE
DIPHOSPHATE 1ETU 4
COMPLEX 1ETU 5


ELONGATION FACTOR TU 
(EF-TU); CHAIN: A; 


ELONGATION FACTOR G; 
CHAIN: A; 




TRANSLATION 
INITIATION FACTOR
IF2/EIF5B; CHAIN: A;






QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 


SEQFOLD 
score 






















PMF 
score 






-0.12 


-0.14


-0.12 


-0.14 


-0.09 


-0.14 




0.23 


Verify
score






0.12 


0.21 


0.09 


0.17 


0.23 


0.19 




-0.28 




Blast 






3.4e-35 


5^ 
6 

r-J 


to 
6 


cs 
^ 


00 
00 

vd 


cs 

l-H 




9.9e-42 


END 1 










OO 

o 

CM 




»n 
1-t 


m 

(S 


»o 

l-H 






START 
AA 








NO 


NO 


ON 


cs 


OV 

T-H 




t-H 


CHAIN 1 


>-* 






< 




< 


< 


< 


< 




< 








1ega


1etu


s 

u 






CO 

*«> 

l-H 


2efg 

1 

1 
1 




1a1h


SEQ ID
NO: 






VO 

CO 


OS 
VO 
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0\ 


VO 
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o\ 

VO 

m 

PDB annotation 


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL


Compound 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B,
C;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD
score 














PMF 
score 


0.13 


1.00


1.00 


1.00 


1.00 


1.00 


Verify 
score 


0.20 


0.49 


0.55 


0.44 


0.38 


0.08 


Psi 
Blast 


<o 
ts 


! 

«^ 


OO 

\d 


l 


1.2e-47 


00 


is 


P: 




m 


f* 

VO 




cn 


START 
AA 






*n 






NO 


CHAIN
ID 


< 


a 


a 


U 


a 


O 




1a1h


1mey


1mey


1mey


1mey


1mey


SEQ ID 
NO: 


o 

CO 


? 


CO 


o 

CO 


o 
r- 

CO 


o 

CO 
PDB annotation


ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION. 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


o 


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


SEQFOLD 
score 






99.97 










55.66 




PMF 
score 




1.00 




1.00 


0.99 


0.64 






1.00 


Verify 
score 




00 

in 
o 




0.29 


0.45 


0.09 






0.62 


Psi
Blast 




1.4e-31


cn 
6 
^. 
cn 


cn 

^. 

cn 


cn 


1.7e-29 




00 

1 

1—1 


1.7e-48 






o 


ov 
cn 


? 

cn 


»o 






VO 
OO 
*— t 


t-H 

00 


START 
AA 






o 

00 




OO 

cn 


ON 
OO 






o 


CHAIN 




< 


< 


< 


< 


< 




< 








2gli 




2gli 


•pi* 


CM 




1byu


1byu


SEQ ID
NO: 
PDB annotation





COMPLEX (SMALL 
GTPASE/NUCLEAR
PROTEIN) COMPLEX
(SMALL GTPASE/NUCLEAR
PROTEIN), SMALL GTPASE,
2 NUCLEAR TRANSPORT


COMPLEX(GTPASE
ACTIVATN/PROTO-
ONCOGENE) GTPASE-
ACTIVATING PROTEIN
RHOGAP; COMPLEX
(GTPASE
ACTIVATION/PROTO-
ONCOGENE), GTPASE, 2
TRANSITION STATE, GAP


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP- 


!5: 

i 


Compound 


RAS P21 PROTEIN
MUTANT WITH GLY 12
REPLACED BY PRO 1PLJ 3
(G12P) COMPLEXED WITH
P3-1-(2-
NITROPHENYL)ETHYL-
1PLJ 4 GUANOSINE-5'-
(B,G-IMIDO)-
TRIPHOSPHATE 1PLJ 5


RAN; CHAIN: A, C; 
NUCLEAR PORE 
COMPLEX PROTEIN 
NUP358; CHAIN: B, D; 


P50-RHOGAP; CHAIN: A; 
TRANSFORMING PROTEIN 
RHOA; CHAIN: B; 


RAB-3A; CHAIN: A; 


< 


i< 




SEQFOLD 
score 




104.86 


73.28 


115.76 




PMF 
score 










1.00 


Verify 
score 










0.76 


Psi 
Blast 




8.5e-52 

1 


00 


1.7e-64


1.7e-64 






SI 


§^ 


§ 




START 
AA 




1— « 
cn 


CO 




o 


CHAIN 
ID 




U 


pa 


< 


< 


is 








1 


1zbd


SEQ ID 
NO: 






m 

00 




m 

00 

cn 
PDB annotation


1 

Pi 


LIGASE CYCLIN A/CDK2-
ASSOCIATED PROTEIN P45;
CYCLIN A/CDK2-
ASSOCIATED PROTEIN P19;
SKP1, SKP2, F-BOX, LRR,
LEUCINE-RICH REPEAT,
SCF, UBIQUITIN, 2 E3,
UBIQUITIN PROTEIN
LIGASE


LIGASE CYCLIN A/CDK2-
ASSOCIATED PROTEIN P45;
CYCLIN A/CDK2-
ASSOCIATED PROTEIN P19;
SKP1, SKP2, F-BOX, LRR,
LEUCINE-RICH REPEAT,
SCF, UBIQUITIN, 2 E3,
UBIQUITIN PROTEIN
LIGASE


LIGASE SKP2 F-BOX; SKP1;
SKP1, SKP2, F-BOX, LRR,
LEUCINE-RICH REPEAT,
SCF, UBIQUITIN, 2 E3,
UBIQUITIN PROTEIN
LIGASE


LIGASE CYCLIN A/CDK2-
ASSOCIATED P45; CYCLIN
A/CDK2-ASSOCIATED P19;
SKP1, SKP2, F-BOX, LRRS,
LEUCINE-RICH REPEATS,
SCF, 2 UBIQUITIN, E3,
UBIQUITIN PROTEIN
LIGASE


LIGASE CYCLIN A/CDK2-
ASSOCIATED P45; CYCLIN
A/CDK2-ASSOCIATED P19;
SKP1, SKP2, F-BOX, LRRS,


Compound 




SKP2; CHAIN: A, C, E, G, I,
K, M, O; SKP1; CHAIN: B,
D, F, H, J, L, N, P;


w S 
u a 

<f u 
53 Q 


CYCLIN A/CDK2-
ASSOCIATED P19; CHAIN:
A, C; CYCLIN A/CDK2-
ASSOCIATED P45; CHAIN:
B, D;


SKP2; CHAIN: A, C; SKP1;
CHAIN: B, D;


SKP2; CHAIN: A, C; SKP1;
CHAIN: B, D;


SEQFOLD
score 
















CO 
















PMF 
score 




1.00 


0.24 


0.70 


1.00 


0.39 


Verify 
score 




0.25 


0.28 


-0.61 


0.68 


0.23 


Psi 
Blast 




»»> 

m 
1 

^. 

CO 


1.7e-11




5.1e-35 


1.7e-11






9LZ 


00 








START 
AA 






00 






00 


CHAIN 
ID 




< 


< 


< 




< 






1fqv


1fqv


1fs1






SEQ ID
NO: 




00 
00 

cn 


00 
00 

en 


eo 
oo 
ro 


00 

oo 
cn 


00 

cn 
PDB annotation


LEUCINE-RICH REPEATS,
SCF, 2 UBIQUITIN, E3,
UBIQUITIN PROTEIN
LIGASE


w R i 

^ i i 


1 

§1 


RIBONUCLEASE/ANGIOGENIN
INHIBITOR
ACETYLATION, LEUCINE-
RICH REPEATS


ACETYLATION RNASE
INHIBITOR,
RIBONUCLEASE/ANGIOGENIN
INHIBITOR
ACETYLATION, LEUCINE-
RICH REPEATS






IP* 


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


Compound 




RIBONUCLEASE
INHIBITOR; CHAIN: NULL;


RIBONUCLEASE
INHIBITOR; CHAIN: NULL;




|g 

ii 




LEUCINE ZIPPER GCN4 


(BASIC REGION, LEUCINE 
ZIPPER) COMPLEX WITH 
AP-1 DNA 1YSA 3




QGSR ZINC FINGER 


PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE


BINDING SITE; CHAIN: B, 
C; 




PEPTIDE; CHAIN: A;
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 


SEQFOLD 
score 




















PMF 
score 




0.62 


0.71 


0.95 




-0.19 




0.92 


1.00 


Verify 
score 




0.15 


0.20 


0.35 




0.08 




0.01 


0.13 


Psi 
Blast 




1.2e-12 


1.4e-11


lO 

1^ 




VO 
VO 




5.1e-27 


<^ 

% 






^ 


«s 






t«-4 








START 
AA 


















CO 


CHAIN 
ID 












a 




<: 


< 






1. 


2bnh 








1ysa




1a1h


1a1h


SEQ ID
NO: 




00 
00 

m 


00 ( 
OO ( 
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OS 
OO 
PDB annotation




COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL


Compound 


a 


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQFOLD 
score 




72.18 








83.29 




PMF 
score 






0.51 


0.65 


1.00 




0.98 


Verify 
score 






9 


-0.20 


0.13 




d 


Psi 
Blast 




1.7e-29 




3.4e-41 


1.7e-50 


o 

V 
«3 










1 




CO 




NO 

M 


START 
AA 










pi 

IB-* ■ 


P! 


O 


s 
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< 


U 
1mey


1mey


1mey




SEQ ID
NO: 
COMPLEX (TRANSCRIPTION
REGULATION/DNA)
GABPALPHA; GABPBETA1;
COMPLEX (TRANSCRIPTION
REGULATION/DNA), DNA-
BINDING, 2 NUCLEAR
PROTEIN, ETS DOMAIN,
ANKYRIN REPEATS,
TRANSCRIPTION 3 FACTOR


PDB annotation 


PROTEIN, CALCIUM-
BINDING 2 PROTEIN,
PHOSPHATIDYLSERIN] 
PROTEIN KINASE C 










ANTI-ONCOGEJ^CRIT 
CYCn:-E. ANTI-ONCOGE 
REPEAT. ANK REPEAT 


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
GABPALPHA; GABPBETA1;
COMPLEX (TRANSCRIPTION
REGULATION/DNA), DNA-
BINDING, 2 NUCLEAR
PROTEIN, ETS DOMAIN,








GA BINDING PROTEIN
ALPHA; CHAIN: A; GA
BINDING PROTEIN BETA
1; CHAIN: B; DNA; CHAIN:
D, E;


GA BINDING PROTEIN
ALPHA; CHAIN: A; GA
BINDING PROTEIN BETA
1; CHAIN: B; DNA; CHAIN:
D, E;


Compound


CALCIUM/PHOSPHOLIPID
BINDING PROTEIN
SYNAPTOTAGMIN I
(FIRST C2 DOMAIN)
(CALB) 1RSY 3


CALCIUM/PHOSPHOLIPID
BINDING PROTEIN
SYNAPTOTAGMIN I
(FIRST C2 DOMAIN)
(CALB) 1RSY 3


CALCIUM/PHOSPHOLIPID
BINDING PROTEIN
SYNAPTOTAGMIN I
(FIRST C2 DOMAIN)
(CALB) 1RSY 3




TUMOR SUPPRESSOR
P16INK4A; CHAIN:










































SEQFOLD
score








169.67 












PMF 
score 




1.00 




1.00 




1.00


0.03 


1.00


Verify 
score 




0.70 




0.79 




0.20


1 

oro- 


(O 

f— ( 
d 


Psi 
Blast 




1 
o\ 


1 


1 

CO 




CO 
CN 

NO 


5.1e-34


CO 






CN 










CO 




START 
AA 




cs 
<n 

t-H 


Pi 

l-H 






o 

t-<l 






CHAIN 
ID 














« 




gs 




1rsy




1rsy




o 

•a 

»— » 


1awc


1awc


SEQ ID
NO:




a\ i 

CO i 


-o 


ON 
fO 




o\ 

CO 


so 

CO 


CO 
PDB annotation


ANKYRIN REPEATS,
TRANSCRIPTION 3 FACTOR


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
GABPALPHA; GABPBETA1;
COMPLEX (TRANSCRIPTION
REGULATION/DNA), DNA-
BINDING, 2 NUCLEAR
PROTEIN, ETS DOMAIN,
ANKYRIN REPEATS,
TRANSCRIPTION 3 FACTOR


TUMOR SUPPRESSOR
TUMOR SUPPRESSOR,
CDK4/6 INHIBITOR,
ANKYRIN MOTIF


TUMOR SUPPRESSOR
TUMOR SUPPRESSOR,
CDK4/6 INHIBITOR,


TUMOR SUPPRESSOR
TUMOR SUPPRESSOR,
CDK4/6 INHIBITOR,


ANKYRIN MOTIF

COMPLEX (KINASE/ANTI-
ONCOGENE) CDK6;
P16INK4A, MTS1; CYCLIN
DEPENDENT KINASE,
CYCLIN DEPENDENT 2
KINASE INHIBITORY
PROTEIN, CDK, INK4, CELL
CYCLE, MULTIPLE TUMOR
SUPPRESSOR, 3 MTS1,
COMPLEX (KINASE/ANTI-
ONCOGENE) HEADER


COMPLEX (INHIBITOR
PROTEIN/KINASE)
INHIBITOR PROTEIN,
CYCLIN-DEPENDENT
KINASE, CELL CYCLE 2


1 
t 

o 
U 




GA BINDING PROTEIN
ALPHA; CHAIN: A; GA
BINDING PROTEIN BETA
1; CHAIN: B; DNA; CHAIN:
D, E;


P19INK4D CDK4/6
INHIBITOR; CHAIN: NULL;


P19INK4D CDK4/6
INHIBITOR; CHAIN: NULL;


P19INK4D CDK4/6
INHIBITOR; CHAIN: NULL;


CYCLIN-DEPENDENT
KINASE 6; CHAIN: A;
MULTIPLE TUMOR
SUPPRESSOR; CHAIN: B;


CYCLIN-DEPENDENT
KINASE 6; CHAIN: A;
P19INK4D; CHAIN: B;


SEQFOLD 
score 
















PMF 
score 




1.00 


0.88 


1.00 


0.81 


1.00 


1.00 


Verify 
score 




0.16 


0.14 


0.09 


0.21 


0.25 


0.28 


Psi 
Blast 




6 

CO 

\d 


1.3e-30


1e-27


ON 


1.6e-20 


VO 






On 

r-t 


00 

a\ 


ON 
i-« 






On 


START 
AA 






o 


CM 


00 


o 


CM 
« 


CHAIN 
ID 




PQ 










A 






1 


QO 

S 


1bd8


00 

S 


is 


^< 

,o 
1— « 


SEQ ID
NO: 




VO 
Ov 
cn 


VO 
ON 


NO 
ON 


so 

ON 


so 

ON 

m 
PDB annotation


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


Compound 


CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


TRANSCRIPTION FACTOR
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


SEQFOLD 
score 






60.10 


114.59 




PMF 
score 




0.04 






1.00 


Verify 
score 




-0.12 






-0.07 


Psi
Blast 




OO 

cn 

00 


00 
»-< 

6 

1-4 


ee 
cn 


00 

cn ■ 


is 






0^ 
O 
cn 


cn 


»o 
cn 


START 
AA 




cn 








CHAIN 
ID 




U 


< 


< 
1mey




1-H 




SEQ ID
NO: 
PDB annotation


DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC




Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;




SEQFOLD 
score 














95.94 


Tables 


PMF 
score




1.00 


1.00 

L 


1.00


1.00 


1.00 






Verify 
score 




0.06 


-0.19 


0.15 


0.13 


d 






Psi 
Blast 






! 
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00 

NO 
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cn 








\o 




START 
AA 
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00 
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en 
to 




CHAIN 
ID 




U 




u 


U 


U 


a 




Is 




i 

*— t 


1 


1mey


1mey


1mey


1mey




SEQ ID
NO: 
PDB annotation


STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2


Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


SEQFOLD 
score 








104.78 






PMF 
score 




0.66 


0.83 




1.00 


0.43 


Verify 
score 




0.08 


-0.09 




0.01 


-0.46 


Psi 
Blast 




m 

vi 


1e-38


1.7e-36 




cn 

CO 








00 

VO 




cn 


B 


START 
AA 




1—1 
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cn 
cn 


»n 


CHAIN 
ID 




< 
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VO 




1tf6


1tf6


SEQ ID
PDB annotation


(CYTOKINE/RECEPTOR)
EPOBP; ERYTHROPOIETIN,
ERYTHROPOIETIN
RECEPTOR, SIGNAL 2
TRANSDUCTION,
HEMATOPOIETIC
CYTOKINE, CYTOKINE
RECEPTOR 3 CLASS 1,
COMPLEX
(CYTOKINE/RECEPTOR)


CELL ADHESION PROTEIN
RGD, EXTRACELLULAR
MATRIX 1FNF 18


HEPARIN AND INTEGRIN
BINDING HEPARIN AND
INTEGRIN BINDING


HEPARIN AND INTEGRIN
BINDING HEPARIN AND
INTEGRIN BINDING


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN


STRUCTURAL PROTEIN


Compound 


CHAIN: A;


ERYTHROPOIETIN
RECEPTOR; CHAIN: B, C;


CELL ADHESION PROTEIN
FIBRONECTIN CELL-
ADHESION MODULE TYPE
III-10 1FNA 3


FIBRONECTIN; 1FNF 6
CHAIN: NULL; 1FNF 7


FIBRONECTIN; CHAIN: A;


FIBRONECTIN; CHAIN: A;


FIBRONECTIN; CHAIN:
NULL;


FIBRONECTIN; CHAIN:
NULL;


FIBRONECTIN; CHAIN:
NULL;


INTEGRIN BETA-4


SEQFOLD
score






92.46 


81.90 




60.33 






66.57


PMF
score




0.69 






0.19 




0.07 


0.64 




Verify 
score 




-0.27 






»n 

d 
1 




-0.02 


-0.40 






Psi
Blast




8.5e-14 


8.5e-32


1e-27


xn 
«n 


1.7e-26 


On 
On 

ON 


1.7e-14 


1.2e-25






& 


cs 

VO 
cn 


On 
OJ 


VO 

o 






& 


»-< 


START 
AA 






»— 1 
cs 


m 


cn 


i-H 


cn 

CM 




CM 




a 








< 


< 














CD 

a 


Ifof 




Ifhh 


Imfh 


Imfh 




cn 
b 

cr 


SEQ ID
NO: 




m 
5 


cn 
Tf 


cn 
f— ♦ 


cn 
5 


cn 


cn 


cn 

5f 


cn 
PDB annotation


SERINE/THREONINE-
PROTEIN KINASE, MAP
KINASE, 2 ERK2


SERINE PROTEASE SERINE
PROTEINASE, TRYPSIN,
HYDROLASE


SERINE PROTEASE SERINE
PROTEINASE, TRYPSIN,
HYDROLASE


SERINE PROTEINASE
TRYPSIN-LIKE SERINE
PROTEINASE, TETRAMER,
HEPARIN, ALLERGY, 2
ASTHMA


SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE


SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE


SERINE PROTEASE
PRORENIN CONVERTING
ENZYME (PRECE),
EPIDERMAL GLANDULAR
KALLIKREIN, SERINE
PROTEASE, PROTEIN
MATURATION


COMPLEX (BLOOD
COAGULATION/INHIBITOR)
AUTOPROTHROMBIN IIA;
HYDROLASE, SERINE
PROTEINASE), PLASMA
CALCIUM BINDING, 2
GLYCOPROTEIN, COMPLEX
(BLOOD
COAGULATION/INHIBITOR)


SERINE PROTEASE SERINE
PROTEASE HEADER


Compound 






TRYPSIN; CHAIN: A, B, C, 
D; 


TRYPSIN; CHAIN: A, B, C, 
D; 


BETA-TRYPTASE; CHAIN: 
A,B,C,D; 


ALPHA TRYPSIN; CHAIN:
A, B;


ALPHA TRYPSIN; CHAIN:
A, B;


GLANDULAR
KALLIKREIN-13; CHAIN:
A, B;


ACTIVATED PROTEIN C;
CHAIN: C, L; D-PHE-PRO-
MAI; CHAIN: P;


ALPHA THROMBIN;
CHAIN: A, B, F, E;


SEQFOLD 
score 






172.43 




124.61 






132.64 


124.65 




PMF 
score 








1.00 




1.00 


0.99 






0.87 


Verify 
score 








0.76




0.26 


-0.48 






S 
o 


Psi 
Blast 






o 


o 


3.4e-81 


1.7e-49 




ON 


5.1e-72


3.4e-35 








T-H 










1—1 




*-H 


START 
AA 






OO 

f-H 




00 

•-4 






1— » 


OO 




|a 


























u 






< 


< 


< 


< 




< 




u 




PQ 










OO 


1a0l


1aks


1aks


•8 


1aut


1bhx


SEQ ID
NO: 






1— i 










t~- 
PDB annotation


HETNAM


SERINE PROTEASE SERINE
PROTEASE HEADER
HETNAM


SERINE PROTEASE SERINE
PROTEASE, HYDROLASE,
COMPLEMENT, FACTOR D,
CATALYTIC 2 TRIAD, SELF-
REGULATION


BLOOD CLOTTING TSV-PA;
FIBRINOLYSIS,
PLASMINOGEN
ACTIVATOR, SERINE
PROTEINASE, 2 SNAKE
VENOM, COMPLEX
(HYDROLASE/INHIBITOR),
BLOOD CLOTTING


BLOOD CLOTTING TSV-PA;
FIBRINOLYSIS,
PLASMINOGEN
ACTIVATOR, SERINE
PROTEINASE, 2 SNAKE
VENOM, COMPLEX
(HYDROLASE/INHIBITOR),
BLOOD CLOTTING


COMPLEX (SERINE
PROTEASE/INHIBITOR)
INFLAMMATION,
INHIBITOR, SPECIFICITY,
SERINE PROTEASE, 2
COMPLEX (SERINE
PROTEASE/INHIBITOR)


SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE, DIGESTION,
PANCREAS, ZYMOGEN, 2
SIGNAL, MULTIGENE
FAMILY


1 
1 


Compound 




ALPHA THROMBIN;
CHAIN: A, B, F, E;


COMPLEMENT FACTOR D; 
CHAIN: NULL; 


PLASMINOGEN 
ACTIVATOR; CHAIN: A, B; 
GLU-GLY-ARG- 
CHLOROMETHYLKETONE 


s: 

r 


PLASMINOGEN
ACTIVATOR; CHAIN: A, B;
GLU-GLY-ARG-
CHLOROMETHYLKETONE
INHIBITOR; CHAIN: E, F;


CATHEPSIN G; CHAIN: A;
PHOSPHONATE
INHIBITOR SUC-VAL-PRO-
PHEP-(OPH)2; CHAIN: S;


TRYPSIN; CHAIN: NULL;


ENTEROPEPTIDASE;


SEQFOLD 
score 






133.21 


154.29 




126.65 


165.05 


136.64 


PMF 
score 




•0 

o 






1.00 








Verify 
score 




-0.62 






0.69 








Psi 
Blast 




«^ 


oo 

NO 

6 


t 

oo 

NO 


00 

\d 


1— t 
r- 
6 

00 


vJ 


1e-79


u< 




00 














START 
AA 






«— » 




00 
«— 1 




00 




CHAIN
ID 








< 


< 


< 




n 






1bhx


1bio


1bqy


1. 


1 


1dpo


1ekb


SEQ ID
NO: 




5 


r- 
PDB annotation


Q O 

il 




IMMUNOGLOBULIN
IMMUNOGLOBULIN, FAB
COMPLEX, IDIOTOPE, ANTI-
IDIOTOPE


IMMUNOGLOBULIN
IMMUNOGLOBULIN, FAB
COMPLEX, IDIOTOPE, ANTI-
IDIOTOPE


IMMUNE SYSTEM VON
WILLEBRAND FACTOR,
GLYCOPROTEIN IBA
(A:ALPHA) BINDING, 2
COMPLEX


8 s 

ml 




Si J 


Compound 


CHAIN: L; CATALYTIC
ANTIBODY 1E9 (HEAVY
CHAIN); CHAIN: H;


IG HEAVY CHAIN V
REGIONS; CHAIN: A; IG
HEAVY CHAIN V
REGIONS; CHAIN: B; IG
HEAVY CHAIN V
REGIONS; CHAIN: C; IG
HEAVY CHAIN V
REGIONS; CHAIN: D;


IG HEAVY CHAIN V
REGIONS; CHAIN: A; IG
HEAVY CHAIN V
REGIONS; CHAIN: B; IG
HEAVY CHAIN V
REGIONS; CHAIN: C; IG
HEAVY CHAIN V
REGIONS; CHAIN: D;


IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: L;
IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: H; VON
WILLEBRAND FACTOR;
CHAIN: A;


IMMUNOGLOBULINA^U 
S HEMAGGLUTININ 
IGG2A FAB FRAGMENT
(FAB 26/9) COMPLEXED
WITH INFLUENZA 1FRG 3
HEMAGGLUTININ HA1
(STRAIN X47) (RESIDUES
101-108) 1FRG 4


o 1 

ii 


SEQFOLD 
score 




65.86 




• • 


66.17 


66.74 


PMF 
score 






o 


o 
d 






Verify 
score 






8 
9 


»— « 

d 






Psi 
Blast 




1.4e-05 






6 




9 
«n 


I? 




00 

cs 


o\ 
«^ 


s; 




00 

cs 


START 
AA 






m 
tn 


<n 


cn 
cs 


cs 


CHAIN 
ID 




o 


Q 


W 


X 








o 
*o 

*— 1 


u 
*o 


i 




1 

1-4 


eg 
PDB annotation


INSECT IMMUNITY INSECT 
IMMUNITY, LPS-BINDING, 
HOMOPHILIC ADHESION 


RECEPTOR RECEPTOR,
SIGNAL TRANSDUCER OF
IL-6 TYPE CYTOKINES,
THIRD 2 N-TERMINAL
DOMAIN,
TRANSMEMBRANE,
GLYCOPROTEIN


g 

i 

H pQ 

i i i ^ 

U O U H 


CELL ADHESION NEURAL
CELL ADHESION


CELL ADHESION NEURAL
CELL ADHESION


CELL ADHESION NEURAL
CELL ADHESION


CELL ADHESION NEURAL
CELL ADHESION


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


GROWTH FACTOR/GROWTH


Compound 


« 

1 

s 




r 

4 
* 


TITIN; CHAIN: NULL; 


< 


1 

\ 


AXONIN-1; CHAIN: A; 


AXONIN-1; CHAIN: A;


AXONIN-1; CHAIN: A;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH


SEQFOLD 
score 




















PMF 
score 


0.24 


0.04 


0.51 


0.04 


o 


0.13


0.33 


0.01 


-0.11 




0.36 


Verify 
score 


0.32 


0.20 


-0.07 


-0.14 


-0.00 


-0.03


0.06 


-0.09 


0.00 




0.05 


Psi 
Blast 


? 


CS 


CM 

6 

CO 


CM 


1.7e-34 




CO 
0) 

«o 


! 

to 

OO 


f— 1 

6 

00 




8.5e-26




oo 
oo 

00 




00 

c< 
m 


00 




00 


00 
00 

oo 


VO 
CO 

m 






oo 
oo 

00 


START 
AA 


VO 
v> 






o 
o 

CO 


CO 




o 


rO 


o 




cs 
r* 


CHAIN 
ID 


< 






<; 


< 


< 


< 


u 


U 




a 




1bih






1cs6


NO 

M 

u 


VO 

Vi 

O 


VO 

CA 

u 


1cvs


1cvs




1cvs


SEQ ID
NO: 


CO 

(N 


m 

CM 

XT 




CO 
CS 


CO 

3 


5f 


CO 

cs 


m 
(S 






CO 
CM 
PDB annotation


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


IMMUNE SYSTEM FC-
EPSILON RI-ALPHA;
IMMUNOGLOBULIN FOLD,
GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN


IMMUNE SYSTEM HIGH
AFFINITY IGE-FC
RECEPTOR, FC(EPSILON)
IGE-FC; IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN, IGE ANTIBODY,
IGE-FC


IMMUNE SYSTEM HIGH
AFFINITY IGE-FC
RECEPTOR, FC(EPSILON)
IGE-FC; IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN, IGE ANTIBODY,
IGE-FC


sulfite/! 

55 

6 2 


i 

J O 2 

m 






FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;




HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; IG EPSILON 
CHAIN C REGION; CHAIN: 
B,D; 


HIGH AFFINITY
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; IG EPSILON 
CHAIN C REGION; CHAIN: 
B,D; 


k; CHAIN: 


< 


< 


1 

9 
O 

U 


HIGH AFFINITY
IMMUNOGLOBULIN
EPSILON RECEPTOR
CHAIN: A;


^1 
2 5^ 

O U v 
S. B. •< 


TELOKIN; CHAIN:


SEQFOLD


score 
































PMF 
score 


0.18 


0.36 


0.63 


0.18 


-0.12 


0.84 


0.75 


Verify 
score 


0.15 


0.09 


0.06 


0.06


-0.00 


0.31 


0.39 


Psi 
Blast 


oo 


m 
cn 


<s 




9.9e-12


VO 
f— < 

(£> 


1e-09




ss 

oo 


00 
00 
00 


VO 
00 


00 
00 
00 


i 


O 

? 


VO 

JT 


START 
AA 












»— < 
»n 

CO 


00 

m 
to 


CHAIN 

m 




< 


< 


< 


< 


< 


< 


PDB 






1f6a


iS 


oo 


00 


tkO 


o 


• 

> 






cn 




9 
PDB annotation

GLYCOPROTEIN


I 

\ 




KINASE KINASE, SIGNAL
TRANSDUCTION,
CALCIUM/CALMODULIN

TTTVT A Ct3 VTKT A C"P CTnM AT 




TRANSFERASE,
SERINE/THREONINE-
PROTEIN KINASE, CASEIN
KINASE, 2 SER/THR KINASE






Compound 


^ 3 CSl 

z is 

1 i 

H H 


LECTIN (AGGLUTININ)
WHEAT GERM
AGGLUTININ (ISOLECTIN
2) 9WGA 3


CALCIUM/CALMODULIN-
DEPENDENT PROTEIN
KINASE; CHAIN: NULL;


iii

3 9 w


PROTEIN KINASE
CK2/ALPHA-SUBUNIT;
CHAIN: NULL;




TRANSFERASE (PHOSPHO
TRANSFERASE) $C-/AMP$-
DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA ($S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


TRANSFERASE (PHOSPHO
TRANSFERASE) $C-/AMP$-
DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3


SEQFOLD
score 


52.76 


00 


75.71 




73.63 


78.39




PMF
score








d 






0.92 




score 








-0.14 






0.33 


2 


Blast 


! 

1-4 


1.7e-18 


8.5e-72 


00 


! 

00 


cn 


1 

cn 


END
AA


m 
oo 


00 


o\ 

CO 


cn 
o\ 
<n 


1 




cn 


START
AA




1—1 


so 

l-H 




in 
cn 


m 
cn 


o 

«n 




B 












W 






1 


9wga 


1a06


NO 

i-H 


o 


1apm


1apm


SEQ ID 
NO:


5 








5 
PDB annotation


/ 

1 


PROTEIN KINASE CDK2; 
PROTEIN KINASE, CELL 


C W 00 (I 

£ ^ p S f 


5 1 » 2 S E 

3 P 3 S3 5 3 G9 


s Is 

Orji'5R2 

a! 2 « Pi 

iliiiil 


KINASE), RECEPTOR 2


i 

m 


^ . 

H 3 2^ Q fcl 

mk 

J sspm q u a 


Compound 


(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA ($S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


in 


\ 
J 


J SO 


FK506-BINDING PROTEIN;
CHAIN: A, C, E, G; TGF-B
SUPERFAMILY RECEPTOR
TYPE I; CHAIN: B, D, F, H;




CYCLIN-DEPENDENT
KINASE 6; CHAIN: A, C;
CYCLIN-DEPENDENT
KINASE INHIBITOR;
CHAIN: B, D;


SEQFOLD 
score 






x> 


80.91 


87.11 


PMF
score




0.89 








Verify 
score 




-0.04 








Psi
Blast




o 

■i 


o 

■i 


6.6e-31


•2 

oo 
so 






»n 
o\ 
m 




o 


cn 


START 












L 


i 


• 








< 


PDB 

m 




1aq1


1aq1


o 


00 


e. 

CO 


1 


in 




5 
PDB annotation




TRANSFERASE KINASE
DOMAIN, AUTOINHIBITORY
FRAGMENT, HOMODIMER


PHOSPHOTRANSFERASE
FGFR1K, FIBROBLAST
GROWTH FACTOR
RECEPTOR 1;
TRANSFERASE, TYROSINE-
PROTEIN KINASE, ATP-
BINDING, 2
PHOSPHORYLATION,
RECEPTOR,
PHOSPHOTRANSFERASE


PHOSPHOTRANSFERASE
FGFR1K, FIBROBLAST
GROWTH FACTOR
RECEPTOR 1;
TRANSFERASE, TYROSINE-
PROTEIN KINASE, ATP-
BINDING, 2
PHOSPHORYLATION,
RECEPTOR,
PHOSPHOTRANSFERASE


J S W ^ 2 C 
^ ^ g § cs ^ pt! ! 

a||||||| 

^ H CO PQ 0 S 1 


j" J 

ii 

^ CO S ^ 


Compound 


(CATALYTIC SUBUNIT)
1CTP 4


1 


PROTEIN KINASE PAK-
ALPHA; CHAIN: A, B;
SERINE/THREONINE-
PROTEIN KINASE PAK-
ALPHA; CHAIN: C, D;


FGF RECEPTOR 1; CHAIN:
A, B;


FGF RECEPTOR 1; CHAIN:
A, B;






m 

"1 


HUMAN CYCLIN- 
DEPENDENT KINASE 2; 




SEQFOLD 
score 






84.95 


82.99 




97.32 


PMF 
score 




0.06 






S! 

d 




Verify
score 




-0.05






3 

d 




Psi
Blast 




oo 
so 

4) 


i 

NO 


o 
m 

6 
cn 


■i 

00 


00 






o\ 


f— 1 

o 


o 


o> 
m 


? 


START 




OS 


1 

so 


*n 


\o 


VO 


CHAIN 

m 




U 


< 


PQ 






PDB 
m 




B 


1fgk


•Si 

1— » 


1hcl


1hcl


a . 
Si 


i 








m 
PDB annotation


PROTEINASE YSCE
SUBUNIT MACROPAIN
SUBUNIT PRE3,
PROTEINASE YSCE
SUBUNIT PROTEASOME,
UBIQUITIN, DEGRADATION,
PROTEASE, NTN-
HYDROLASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE










oj Lj tij w 


■4 PH 1^ 


s § £ g 




Compound 


PROTEASOME 
COMPONENT PRE3; 


CHAIN: N, 2; 


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


SEQFOLD
score 




76.38 






66.44 


PMF 
score 






2 


OO 

S 




t £ 








cs 
m 
d 


cn 
d 




1 i 










Psi 
Blast 




1 




1 


1.7e-48 






en 






S 
en 


START 
AA 




so 




VO 
NO 


So 


CHAIN 
ID 






U 










ai 














SEQ ID 
NO: 




00 

9 


00 


00 

9 


00 

5! 






COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)




PDB annotation 


DOMAIN; PROTEIN-
PROTEIN INTERACTION
DOMAIN,
TRANSCRIPTIONAL 2
REPRESSOR, ZINC-FINGER
PROTEIN, X-RAY
CRYSTALLOGRAPHY,
PROTEIN STRUCTURE,
PROMYELOCYTIC
LEUKEMIA, GENE
REGULATION


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 


LEUKEMIA ZINC FINGER 
PROTEIN PLZF; CHAIN: A; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


P 




















SEQ FOLD
score 




















PMF 
score 




-0.20 


0.22 


-0.14 


0.82 


Verify 
score 




0.11 


vo 
o 

9 


0.07 


-0.01 


Psi 
Blast 




1.4e-45


1e-47


? 

00 


1.4e-11








i-i 


<n 
«o 




START 
AA 




m 






CO 


CHAIN 
ID 




U 


a 


O 


O 






1mey


1mey


1mey


1mey


SEQ ID 
NO: 




o 
9 


o 
m 


o 
m 


o 
m 
PDB annotation


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


ZINC FINGER
TRANSCRIPTION FACTOR
SP1; ZINC FINGER,
TRANSCRIPTION
ACTIVATION, SP1


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


TRANSCRIPTION
REGULATION


TRANSCRIPTION
REGULATION, ADR1, ZINC
FINGER, NMR






Compound 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




SP1F2; CHAIN: NULL;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


ADR1; CHAIN: NULL;


ZINC FINGER DNA
BINDING DOMAIN ZINC- 


FINGER (ZFY-iSWAJ:') 
(NMR, 12 STRUCTURES) 
7ZNF3 


CYTOCHROME P450 2C5; 
CHAIN: A; 


SEQ FOLD
score 














PMF 
score 


d 


d 


m 

9 


o 

d 


vo 

d 


8. 


Verify 
score 


d 


o 
d 


oo 
o 
d 


<? 


-0.62 


d 


Psi 
Blast 






1e-30


5.1e-17 


1e-05


o 




m 




R 

to 




vo 
m 




START 
AA 


m 
? 


00 


!^ 


5 


00 




CHAIN 
ID 


O 




u 






< 




1mey




1 


1 






SEQ ID 
NO: 


o 

ro 


1 


1 


o 

9 


o 
5? 


en 

9 


PDB annotation 


STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL




FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)
COMPLEX (ZINC


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E;


SEQ FOLD 
score 














PMF 

score 




0.42 


0.39 


0.83 


0.39 


o 




0.86 


Verify 
score 




o 
o 


-0.23 


-0.40 


-0.15 


-0.19 
0.03 


Psi 
Blast 




o 

1 


! 


6.8e-35 




5.1e-42 
3.4e-45 


is 




»o 


00 






o 

8 • S 


START 
AA 




1— ♦ 


5? 
»o 


00 

S3 


OO 


SI ^ 


CHAIN 




U 


U 


u 


U 


o o 


PDB 




1mey


1mey


1 


1 

i-H 


t 1 


li 


5 


CO 


en 


w-j 

CO 


»n 
cn 


3 
PDB annotation


5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN-


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION


Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


SEQ FOLD 
score 












PMF 
score 




0.09 


0.80 


0.60 


0.36 


Verify 
score 




-0.24 


-0.34 


-0.45 




Psi 
Blast 




1e-29


? 

i-H 


in 


cs 

00 








m 
vo 

CO 




i 


START 
AA 












CHAIN 
ID 




< 


U 


a 


U 


la 




NO 


1ubd


1ubd


1ubd


SEQ ID
NO: 




9 


m 

9 


m 

CO 


lO 

9 
Table 6

SEQ ID NO:  Position of Signal Peptide  Maximum score  Mean score
1           24                          0.978          0.760
2           32                          0.995          0.681
3           37                          0.979          0.718
4           18                          0.925          0.822
5           28                          0.939          0.749
6           41                          0.989          0.690
7           26                          0.960          0.674
8           16                          0.973          0.925
9           24                          0.978          0.760
10          18                          0.887          0.579
11          42                          0.977          0.587
12          21                          0.966          0.848
13          25                          0.993          0.954
14          28                          0.909          0.664
16          23                          0.913          0.597
17          42                          0.978          0.689
18          21                          0.930          0.662
19          45                          0.985          0.714
20          37                          0.992          0.855
21          31                          0.947          0.775
22          20                          0.979          0.911
24          30                          0.924          0.720
25          26                          0.974          0.824
26          28                          0.982          0.649
28          16                          0.912          0.705
29          27                          0.957          0.652
30          22                          0.968          0.844
31          23                          0.952          0.812
32          18                          0.932          0.884
33          29                          0.991          0.729
34          26                          0.939          0.709
35          29                          0.961          0.842
36          16                          0.951          0.777
37          27                          0.983          0.898
38          17                          0.991          0.955
39          33                          0.977          0.822
40          17                          0.989          0.969
41          30                          0.936          0.679
42          24                          0.993          0.810
44          22                          0.990          0.921
54          18                          0.925          0.822
56          18                          0.981          0.951
60          28                          0.939          0.749
62          33                          0.979          0.757
70          41                          0.989          0.690
79          26                          0.960          0.674
83          18                          0.979          0.963
84          22                          0.967          0.792
87          25                          0.980          0.867
97          16                          0.973          0.925
98          24                          0.978          0.760
99          17                          0.978          0.925

Table 6

SEQ ID NO:   Position of Signal Peptide  Maximum score  Mean score
113          [illegible]                 [illegible]    [illegible]
115          [illegible]                 [illegible]    [illegible]
120          42                          0.977          [illegible]
137          21                          [illegible]    [illegible]
140          25                          0.993          0.954
153          28                          0.909          [illegible]
156          18                          0.954          0.747
174          23                          0.913          [illegible]
175          20                          [illegible]    [illegible]
178          42                          0.978          [illegible]
180          32                          0.929          [illegible]
184          21                          0.979          0.941
192          21                          0.930          0.002
200          45                          0.985          0.714
212          37                          0.992          0.855
225          24                          0.971          [illegible]
228          20                          0.979          0.911
237          17                          0.982          [illegible]
251          13                          0.918          0.692
252          13                          0.918          0.692
256          20                          0.912          0.693
257          20                          0.912          0.693
260          26                          0.974          0.824
262          18                          0.965          0.833
267          25                          0.956          0.765
288          16                          0.912          0.705
289          18                          0.896          0.634
290          19                          0.966          [illegible]
294          18                          0.991          0.973
295          20                          0.906          0.580
299          27                          0.957          0.052
307          19                          0.983          0.871
310          22                          [illegible]    [illegible]
320          23                          [illegible]    [illegible]
324          27                          [illegible]    [illegible]
327          18                          [illegible]    [illegible]
[illegible]  [illegible]                 [illegible]    [illegible]
332          27                          [illegible]    [illegible]
335          [illegible]                 0.983          0.793
346          29                          0.991          0.729
354          22                          0.978          0.877
363          26                          0.939          0.709
364          22                          0.966          0.843
375          29                          0.961          0.842
379          16                          0.951          0.777
401          44                          0.975          0.876
407          33                          0.977          0.822
417          17                          0.989          0.969
418          23                          0.974          0.799
422          18                          0.981          0.952
426          21                          0.982          0.912

Table 6

SEQ ID NO:  Position of Signal Peptide  Maximum score  Mean score
428         30                          0.936          0.679
429         43                          0.978          0.712
433         28                          0.993          0.948
434         43                          0.930          0.624
437         24                          0.993          0.810
438         16                          0.978          0.939

Table 7

SEQ ID NO:  Chromosomal location
3           2q11.2
4           20pter-p12.3
5           5q31
6           19p12
7           19p12
8           5
11          12p13-p12
12          p11.2-12.3
13          19p
14          6p12.1-21.1
15          19p13.1
17          16q12-q13
19          15
20          15
22          Xq13.1
23          12
25          11p15.5
26          20
27          22
28          12q23-24.1
29          20
30          13
31          12
33          15
36          4q28
37          14q24.3
38          10
39          20
41          17q12-q21
42          14
44          1q24.1-25.2
45          2
47          3q21-q25
48          9
49          14
50          6q14.1-15
51          19
52          11
53          20
54          16
55          14
56          3
57          19
58          7p15.1-p13
59          19
61          2
62          19
63          16
66          15
70          1p31.1-33
71          9
72          16

Table 7

SEQ ID NO:  Chromosomal location
74          5q31-q33
75          3p21.1-q13.13
76          2
77          2
78          21q22.1
79          Xp11.22-p11.21
80          2
81          19
82          20
83          19p13.3
84          19
85          3
86          8
87          1p13
88          16
89          18q21.1-q22
90          11q13.1-q13.3
91          18p11.23-p11.21
92          17
93          10
94          3
95          X
96          6q14.2-16.1
97          1q21.2-22
98          1q21.2-22
99          6
102         8q22-q23
103         10p11.2
104         17
105         17
106         2
107         1
108         16
109         17q21.3-q22
110         11q
111         3p21.1-q13.13
112         16
113         5
114         9
115         3p13-q26.1
116         5
117         7q31
118         14
119         14
120         19
121         19
122         6q27
123         14
124         1q21-q22
125         6
126         17q25
127         15

Table 7

SEQ ID NO:  Chromosomal location
129         14q31
130         [illegible]
131         11
132         [illegible]
133         20p11.23-p11.21
134         1p32
135         2q31
136         X
138         12p13
139         9
140         p34.1-34.3
141         19q12
142         15q26
143         22q11.21
144         17q12
145         4p16.3
146         22
147         16p11.2
148         18q12
150         4
151         7p12-q11.21
152         14
153         14q32.33
155         1p34
156         16p13.3
157         12p13.3
158         5
159         8
160         19
161         4
162         1
163         11q23
164         3
165         12q22
168         19
170         1
171         18q12
173         7
174         13
175         [illegible]
176         16
178         10
179         1q21-q25
180         19p13.3
181         1
184         1p35.1-36.23
185         1
186         18
187         3p13-q26.1
188         3
189         17
190         6

SEQ ID NO:  Chromosomal location
193         11p15.5
194         14q32
195         12
196         10q24
198         1p36.1
199         5q22
200         11
201         2q31
202         17
206         Xp11.23
207         9q34
208         19
209         20
210         11q23
211         16p12
212         19q13.1
213         7p15
214         15
215         1p36.21-36.33
216         11
217         22q11.2
218         15
219         19q13.4
222         19
223         1q25.2
226         1
227         1p36.11-36.23
228         1p36.3-p36.13
230         17
231         7q33-q34
232         3
233         9
234         10
235         17
236         4
237         19q13.4
238         4q25
239         2
240         7
241         12
243         6p21.3
244         3p13-q26.1
245         17
246         1p34.1
247         3q23
248         3p21.3
249         20
250         20
251         18q12-q21
252         18q12-q21
253         14
254         1p35.3-p35.1

Table 7

SEQ ID NO:  Chromosomal location
256         [illegible]
257         [illegible]
258         1q21-q23
259         [illegible]
260         14q21.1-q24.1
261         2p23.3-q32.3
262         12
263         19
264         4q28
265         2
266         2
267         1q21-q23
268         20p12.3-p13
269         4
270         6
271         2p23.3-q14.3
272         18q21
273         18q21
274         14q22
275         6p21.3
276         5
280         8
281         4q22-q24
282         2
283         7q22-q31.1
284         11
285         11q12.3
286         10
287         19
290         17
291         4q22
292         1p36.11-36.23
293         19
294         22
296         3
297         4p16
298         [illegible]
299         [illegible]
300         20
302         22q11.2-q22
303         15
304         6
306         6
307         9p24.2
308         2p23.3-q24.3
309         14
310         6
311         2
312         4
313         19pter-19p13.3
314         3

SEQ ID NO:   Chromosomal location
[illegible]  [illegible]
318          17
319          17
320          5q14
[illegible]  4
324          3p
325          [illegible]
326          17p11.2
327          9
328          5q23
329          2
330          3
331          1p21.1-22.1
332          9
333          7
334          11q13
337          14
338          7q35-q36
339          13
340          [illegible]
341          11q12-q13.1
343          [illegible]
344          10
345          10
346          11q22
347          10
348          [illegible]
350          [illegible]
354          10
355          19
356          11
[illegible]  [illegible]
360          [illegible]
362          [illegible]
363          11
364          [illegible]
365          7q31
366          22q13.31-13.32
367          5
370          19
371          7q31.1-7q31.33
372          2q37.3
373          3
374          16
375          19q13.4
376          18q12
377          18q12
379          8
380          11q13
381          6

Table 7

SEQ ID NO:   Chromosomal location
385          4q28
386          15
387          10
388          17
389          11p15.4
390          6p21.3
391          22q13
392          3
393          19
394          15
395          1
396          6p21.2-p21.3
397          15
399          7q31
400          14
402          Xq28
403          10
404          16
406          16
408          11
412          20q12-13.1
413          15
414          17
415          4
416          12q
419          21q22.1
420          16p11.2
422          6
[illegible]  [illegible]
426          14
428          14
429          1q22-q23
430          11q13
431          3
432          2
433          19q13.1
434          20q13.1
435          18q23
436          11q24
437          10
438          4q21-q25

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
52                         52                         54
53                         53                         55
54                         54                         56
55                         55                         57
56                         56                         58
57                         57                         59
58                         58                         60
59                         59                         61
60                         60                         62
61                         61                         63
62                         62                         64
63                         63                         65
64                         64                         66
65                         65                         67
66                         66                         68
67                         67                         69
68                         68                         70
69                         69                         71
70                         70                         72
71                         71                         73
72                         72                         74
73                         73                         75
74                         74                         76
75                         75                         77
76                         76                         78
77                         77                         79
78                         78                         80
79                         79                         81
80                         80                         82
81                         81                         83
82                         82                         84
83                         83                         85
84                         84                         86
85                         85                         87
86                         86                         88
87                         87                         89
88                         88                         90
89                         89                         91
90                         90                         92
91                         91                         93
92                         92                         94
93                         93                         95
94                         94                         96
95                         95                         97
96                         96                         98
97                         97                         99
98                         98                         100
99                         99                         101
100                        100                        102
101                        101                        103
102                        102                        104
103                        103                        105

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
104                        104                        106
105                        105                        107
106                        106                        108
107                        107                        109
108                        108                        110
109                        109                        111
110                        110                        112
111                        111                        113
112                        112                        114
113                        113                        115
114                        114                        116
115                        115                        117
116                        116                        118
117                        117                        119
118                        118                        120
119                        119                        121
120                        120                        122
121                        121                        123
122                        122                        124
123                        123                        125
124                        124                        126
125                        125                        127
126                        126                        128
127                        127                        129
128                        128                        130
129                        129                        131
130                        130                        132
131                        131                        133
132                        132                        134
133                        133                        135
134                        134                        136
135                        135                        137
136                        136                        138
137                        137                        139
138                        138                        140
139                        139                        141
140                        140                        142
141                        141                        143
142                        142                        144
143                        143                        145
144                        144                        146
145                        145                        147
146                        146                        148
147                        147                        149
148                        148                        150
149                        149                        151
150                        150                        152
151                        151                        153
152                        152                        154
153                        153                        155
154                        154                        156
155                        155                        157

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
156                        156                        158
157                        157                        159
158                        158                        160
159                        159                        161
160                        160                        162
161                        161                        163
162                        162                        164
163                        163                        165
164                        164                        166
165                        165                        167
166                        166                        168
167                        167                        169
168                        168                        170
169                        169                        171
170                        170                        172
171                        171                        173
172                        172                        174
173                        173                        175
174                        174                        176
175                        175                        177
176                        176                        178
177                        177                        179
178                        178                        180
179                        179                        181
180                        180                        182
181                        181                        183
182                        182                        184
183                        183                        185
184                        184                        186
185                        185                        187
186                        186                        188
187                        187                        189
188                        188                        190
189                        189                        191
190                        190                        192
191                        191                        193
192                        192                        194
193                        193                        195
194                        194                        196
195                        195                        197
196                        196                        198
197                        197                        199
198                        198                        200
199                        199                        201
200                        200                        202
201                        201                        203
202                        202                        204
203                        203                        205
204                        204                        206
205                        205                        207
206                        206                        208
207                        207                        209

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
208                        208                        210
209                        209                        211
210                        210                        212
211                        211                        213
212                        212                        214
213                        213                        215
214                        214                        216
215                        215                        217
216                        216                        218
217                        217                        219
218                        218                        220
219                        219                        221
220                        220                        222
221                        221                        223
222                        222                        224
223                        223                        225
224                        224                        226
225                        225                        227
226                        226                        228
227                        227                        229
228                        228                        230
229                        229                        231
230                        230                        232
231                        231                        233
232                        232                        234
233                        233                        235
234                        234                        236
235                        235                        237
236                        236                        238
237                        237                        239
238                        238                        240
239                        239                        241
240                        240                        242
241                        241                        243
242                        242                        244
243                        243                        245
244                        244                        246
245                        245                        247
246                        246                        248
247                        247                        249
248                        248                        250
249                        249                        251
250                        250                        252
251                        251                        253
252                        252                        254
253                        253                        255
254                        254                        256
255                        255                        257
256                        256                        258
257                        257                        259
258                        258                        260
259                        259                        261






WO 02/081731



PCT/US02/01222 



Table 8

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
260                        260                        262
261                        261                        263
262                        262                        264
263                        263                        265
264                        264                        266
265                        265                        267
266                        266                        268
267                        267                        269
268                        268                        270
269                        269                        271
270                        270                        272
271                        271                        273
272                        272                        274
273                        273                        275
274                        274                        276
275                        275                        277
276                        276                        278
277                        277                        279
278                        278                        280
279                        279                        281
280                        280                        282
281                        281                        283
282                        282                        284
283                        283                        285
284                        284                        286
285                        285                        287
286                        286                        288
287                        287                        289
288                        288                        290
289                        289                        291
290                        290                        292
291                        291                        293
292                        292                        294
293                        293                        295
294                        294                        296
295                        295                        297
296                        296                        298
297                        297                        299
298                        298                        300
299                        299                        301
300                        300                        302
301                        301                        303
302                        302                        304
303                        303                        305
304                        304                        306
305                        305                        307
306                        306                        308
307                        307                        309
308                        308                        310
309                        309                        311
310                        310                        312
311                        311                        313

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
312                        312                        314
313                        313                        315
314                        314                        316
315                        315                        317
316                        316                        318
317                        317                        319
318                        318                        320
319                        319                        321
320                        320                        322
321                        321                        323
322                        322                        324
323                        323                        325
324                        324                        326
325                        325                        327
326                        326                        328
327                        327                        329
328                        328                        330
329                        329                        331
330                        330                        332
331                        331                        333
332                        332                        334
333                        333                        335
334                        334                        336
335                        335                        337
336                        336                        338
337                        337                        339
338                        338                        340
339                        339                        341
340                        340                        342
341                        341                        343
342                        342                        344
343                        343                        345
344                        344                        346
345                        345                        347
346                        346                        348
347                        347                        349
348                        348                        350
349                        349                        351
350                        350                        352
351                        351                        353
352                        352                        354
353                        353                        355
354                        354                        356
355                        355                        357
356                        356                        358
357                        357                        360
358                        358                        361
359                        359                        362
360                        360                        363
361                        361                        364
362                        362                        365
363                        363                        366

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
364                        364                        367
365                        365                        368
366                        366                        369
367                        367                        370
368                        368                        371
369                        369                        372
370                        370                        373
371                        371                        374
372                        372                        375
373                        373                        376
374                        374                        377
375                        375                        378
376                        376                        379
377                        377                        380
378                        378                        381
379                        379                        382
380                        380                        383
381                        381                        384
382                        382                        385
383                        383                        386
384                        384                        387
385                        385                        388
386                        386                        389
387                        387                        390
388                        388                        391
389                        389                        392
390                        390                        393
391                        391                        394
392                        392                        395
393                        393                        396
394                        394                        397
395                        395                        398
396                        396                        399
397                        397                        400
398                        398                        401
399                        399                        402
400                        400                        403
401                        401                        404
402                        402                        405
403                        403                        406
404                        404                        407
405                        405                        408
406                        406                        409
407                        407                        410
408                        408                        411
409                        409                        412
410                        410                        413
411                        411                        414
412                        412                        415
413                        413                        416
414                        414                        417
415                        415                        418

SEQ ID NO: of Full-length  SEQ ID NO: of Full-length  SEQ ID NO: in Priority Application
Nucleotide Sequence        Nucleotide Sequence        USSN 09/774,528
416                        416                        419
417                        417                        420
418                        418                        421
419                        419                        422
420                        420                        423
421                        421                        424
422                        422                        425
423                        423                        426
424                        424                        427
425                        425                        428
426                        426                        429
427                        427                        430
428                        428                        431
429                        429                        432
430                        430                        433
431                        431                        434
432                        432                        435
433                        433                        436
434                        434                        437
435                        435                        438
436                        436                        439
437                        437                        440
438                        438                        441

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO:
1-438, an active domain coding portion of SEQ ID NO: 1-438, and complementary
sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 
(a) a polypeptide encoded by any one of the polynucleotides of
claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under
stringent conditions with any one of SEQ ID NO: 1-438.



11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a
complex with the polynucleotide of claim 1 for a period sufficient to form the complex;
and

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the
method further comprises reverse transcribing an annealed RNA molecule into a cDNA
polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:
a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected
from SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO: 1-438, an
active domain coding portion of SEQ ID NO: 1-438, complementary sequences thereof
and a polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:
1-438, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a).
20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides encoded by SEQ ID NO: 1-438, the 
mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-438.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of
the polynucleotides in the collection.

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need
thereof a therapeutic amount of a composition comprising an antibody that specifically
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.